Machine learning based monitoring system

ABSTRACT

Systems and methods are provided for machine learning based monitoring. A current time is received. The system determines to begin a check-up process from the current time. In response to determining to begin the check-up process, a prompt to cause a person to perform a check-up activity is presented on a display. Image data of a recording of the check-up activity is received from a camera. The system invokes a screening machine learning model based on the image data. The screening machine learning model outputs a classification result. The system detects a potential screening issue based on the classification result. In response to detecting the potential screening issue, the system provides an alert.

INCORPORATION BY REFERENCE TO ANY PRIORITY APPLICATIONS

The present application claims benefit of U.S. Provisional Application No. 63/298,569 entitled “Intelligent Camera System” filed Jan. 11, 2022 and U.S. Provisional Application No. 63/299,168 entitled “Intelligent Camera System” filed Jan. 13, 2022, the entirety of each of which is hereby incorporated by reference. Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57.

BACKGROUND

A smart camera system can be a machine vision system which, in addition to image capture capabilities, is capable of extracting information from captured images. Some smart camera systems are capable of generating event descriptions and/or making decisions that are used in an automated system. Some camera systems can be a self-contained, standalone vision system with a built-in image sensor. The vision system and the image sensor can be integrated into a single hardware device. Some camera systems can include communication interfaces, such as, but not limited to, Ethernet and/or wireless interfaces.

Safety can be important in clinical, hospice, assisted living, and/or home settings. Potentially dangerous events can happen in these environments. Automation can also be beneficial in these environments.

SUMMARY

The systems, methods, and devices described herein each have several aspects, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of this disclosure, several non-limiting features will now be discussed briefly.

According to an aspect, a system is disclosed comprising: a storage device configured to store first instructions and second instructions; a camera; a hardware accelerator configured to execute the first instructions; and a hardware processor configured to execute the second instructions to: receive, from the camera, first image data; invoke, on the hardware accelerator, a person detection model based on the first image data, wherein the person detection model outputs a first classification result; detect a person based on the first classification result; receive, from the camera, second image data; and in response to detecting the person, invoke, on the hardware accelerator, a fall detection model based on the second image data, wherein the fall detection model outputs a second classification result, detect a potential fall based on the second classification result, and in response to detecting the potential fall, provide an alert.

According to an aspect, the system may further comprise a microphone, wherein the hardware processor may be configured to execute further instructions to: receive, from the microphone, audio data; and in response to detecting the person, invoke, on the hardware accelerator, a loud noise detection model based on the audio data, wherein the loud noise detection model outputs a third classification result, and detect a potential scream based on the third classification result.

According to an aspect, the hardware processor may be configured to execute additional instructions to: in response to detecting the potential scream, provide a second alert.

According to an aspect, the hardware processor may be configured to execute additional instructions to: in response to detecting the potential fall and the potential scream, provide an escalated alert.

According to an aspect, invoking the loud noise detection model based on the audio data may further comprise: generating spectrogram data from the audio data; and providing the spectrogram data as input to the loud noise detection model.
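For illustration only, generating spectrogram data from audio samples could be sketched as follows. The function name `spectrogram`, the frame size, the hop length, the Hann window, and the naive discrete Fourier transform are all assumptions made for this sketch and are not taken from the disclosure; a practical system would likely use an optimized FFT routine.

```python
import cmath
import math


def spectrogram(samples, frame_size=8, hop=4):
    """Return a list of frames, each a list of magnitudes per frequency bin.

    Hypothetical helper: applies a Hann window and a naive DFT for clarity.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window reduces spectral leakage at the frame edges.
        windowed = [
            x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, x in enumerate(frame)
        ]
        # Keep only the non-negative frequency bins (0 .. frame_size // 2).
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames
```

In this sketch, each inner list is one time slice of the spectrogram, and the resulting two-dimensional array is the kind of input that could be provided to the loud noise detection model.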

According to an aspect, the second image data may comprise a plurality of images.

According to an aspect, a method is disclosed comprising: receiving, from a camera, first image data; invoking, on a hardware accelerator, a person detection model based on the first image data, wherein the person detection model outputs a first classification result; detecting a person based on the first classification result; receiving, from the camera, second image data; and in response to detecting the person, invoking, on the hardware accelerator, a plurality of person safety models based on the second image data, for each person safety model from the plurality of person safety models, receiving, from the hardware accelerator, a second classification result, detecting a potential safety issue based on a particular second classification result, and in response to detecting the potential safety issue, providing an alert.

According to an aspect, the method may further comprise: in response to detecting the person, invoking, on the hardware accelerator, a facial feature extraction model based on the second image data, wherein the facial feature extraction model outputs a facial feature vector, executing a query of a facial features database based on the facial feature vector, wherein executing the query indicates that the facial feature vector is not present in the facial features database, and in response to determining that the facial feature vector is not present in the facial features database, providing an unrecognized person alert.
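As a non-limiting sketch, the query of the facial features database could be implemented as a nearest-match search. The in-memory dictionary, the cosine similarity measure, the 0.9 threshold, and the function names below are assumptions for illustration; the disclosure does not specify how presence in the database is determined.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_match(database, vector, threshold=0.9):
    """Return the best-matching identity, or None if nothing is close enough."""
    best_name, best_score = None, threshold
    for name, stored in database.items():
        score = cosine_similarity(stored, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def unrecognized_person_alert(database, vector):
    """True when the query finds no sufficiently similar stored vector."""
    return find_match(database, vector) is None
```

Under these assumptions, a vector with no stored neighbor above the threshold is treated as not present in the database, which would trigger the unrecognized person alert.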

According to an aspect, the plurality of person safety models may comprise a fall detection model, and the method may further comprise: collecting a first set of videos of person falls; collecting a second set of videos of persons without falling; creating a training data set comprising the first set of videos and the second set of videos; and training the fall detection model using the training data set.

According to an aspect, the plurality of person safety models may comprise a handwashing detection model, and the method may further comprise: collecting a first set of videos with handwashing; collecting a second set of videos without handwashing; creating a training data set comprising the first set of videos and the second set of videos; and training the handwashing detection model using the training data set.

According to an aspect, the method may further comprise: receiving, from a microphone, audio data; and in response to detecting the person, invoking, on the hardware accelerator, a loud noise detection model based on the audio data, wherein the loud noise detection model outputs a third classification result, and detecting a potential scream based on the third classification result.

According to an aspect, the method may further comprise: in response to detecting the potential safety issue and the potential scream, providing an escalated alert.

According to an aspect, the method may further comprise: collecting a first set of videos with screaming; collecting a second set of videos without screaming; creating a training data set comprising the first set of videos and the second set of videos; and training the loud noise detection model using the training data set.

According to an aspect, a system is disclosed comprising: a storage device configured to store first instructions and second instructions; a camera; a hardware accelerator configured to execute the first instructions; and a hardware processor configured to execute the second instructions to: receive, from the camera, first image data; invoke, on the hardware accelerator, a person detection model based on the first image data, wherein the person detection model outputs a first classification result; detect a person based on the first classification result; receive, from the camera, second image data; and in response to detecting the person, invoke, on the hardware accelerator, a plurality of person safety models based on the second image data, for each person safety model from the plurality of person safety models, receive, from the hardware accelerator, a model result, detect a potential safety issue based on a particular model result, and in response to detecting the potential safety issue, provide an alert.

According to an aspect, the plurality of person safety models may comprise a fall detection model, and wherein invoking the plurality of person safety models may comprise: invoking, on the hardware accelerator, the fall detection model based on the second image data, wherein the fall detection model outputs the particular model result.

According to an aspect, the plurality of person safety models may comprise a handwashing detection model, and wherein invoking the plurality of person safety models may comprise: invoking, on the hardware accelerator, the handwashing detection model based on the second image data, wherein the handwashing detection model outputs the particular model result.

According to an aspect, the system may further comprise a microphone, wherein the hardware processor may be configured to execute further instructions to: receive, from the microphone, audio data; and in response to detecting the person, invoke, on the hardware accelerator, a loud noise detection model based on the audio data, wherein the loud noise detection model outputs a third classification result, detect a potential loud noise based on the third classification result, and in response to detecting the potential loud noise, provide a second alert.

According to an aspect, the system may further comprise a display, wherein the hardware processor may be configured to execute further instructions to: cause presentation, on the display, of a prompt to cause a person to perform an activity; receive, from the camera, third image data of a recording of the activity; invoke, on the hardware accelerator, a screening machine learning model based on the third image data, wherein the screening machine learning model outputs a third classification result, detect a potential screening issue based on the third classification result, and in response to detecting the potential screening issue, provide a second alert.

According to an aspect, the screening machine learning model may be a pupillometry screening model, and wherein the potential screening issue indicates potential dilated pupils.

According to an aspect, the screening machine learning model may be a facial paralysis screening model, and wherein the potential screening issue indicates potential facial paralysis.

According to an aspect, a system is disclosed comprising: a storage device configured to store first instructions and second instructions; a wearable device configured to process sensor signals to determine a physiological value for a person; a microphone; a camera; a hardware accelerator configured to execute the first instructions; and a hardware processor configured to execute the second instructions to: receive, from the wearable device, the first physiological value; determine to begin a monitoring process based on the first physiological value; and in response to determining to begin the monitoring process, receive, from the camera, image data; receive, from the microphone, audio data; invoke, on the hardware accelerator, a first unconscious detection model based on the image data, wherein the first unconscious detection model outputs a first classification result, invoke, on the hardware accelerator, a second unconscious detection model based on the audio data, wherein the second unconscious detection model outputs a second classification result, detect a potential state of unconsciousness based on the first classification result and the second classification result, and in response to detecting the potential state of unconsciousness, provide an alert.

According to an aspect, the wearable device may comprise a pulse oximetry sensor and the first physiological value is for blood oxygen saturation, and wherein determining to begin the monitoring process based on the first physiological value further comprises: determining that the first physiological value is below a threshold level.

According to an aspect, the wearable device may comprise a respiration rate sensor and the first physiological value is for respiration rate, and wherein determining to begin the monitoring process based on the first physiological value further comprises: determining that the first physiological value satisfies a threshold alarm level.

According to an aspect, the wearable device comprises a heart rate sensor and the first physiological value is for heart rate, and wherein determining to begin the monitoring process based on the physiological value further comprises: receiving, from the wearable device, a plurality of physiological values measuring heart rate over time; and determining that the plurality of physiological values and the first physiological value satisfy a threshold alarm level.
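The threshold determinations described in the preceding aspects could be sketched as below. The function names and the numeric thresholds (90% for blood oxygen saturation, 120 beats per minute, a three-sample window) are placeholder assumptions chosen for illustration only; they are not values from the disclosure and are not clinical guidance.

```python
def spo2_trigger(spo2_percent, threshold=90.0):
    """Begin monitoring when blood oxygen saturation is below the threshold."""
    return spo2_percent < threshold


def heart_rate_trigger(history, latest, alarm_level=120.0, min_samples=3):
    """Begin monitoring when the latest reading and the readings just before it
    all exceed the alarm level, so a single spike does not start monitoring."""
    window = (list(history) + [latest])[-min_samples:]
    if len(window) < min_samples:
        return False  # Not enough readings over time to decide.
    return all(value > alarm_level for value in window)
```

Requiring several consecutive readings is one possible way to evaluate a plurality of values measured over time; other criteria, such as a rate of change, would fit the same description.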

According to an aspect, a system is disclosed comprising: a storage device configured to store instructions; a display; a camera; and a hardware processor configured to execute the instructions to: receive a current time; determine to begin a check-up process from the current time; and in response to determining to begin the check-up process, cause presentation, on the display, of a prompt to cause a person to perform a check-up activity, receive, from the camera, image data of a recording of the check-up activity, invoke a screening machine learning model based on the image data, wherein the screening machine learning model outputs a classification result, detect a potential screening issue based on the classification result, and in response to detecting the potential screening issue, provide an alert.
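One possible way to determine, from the current time, whether to begin a check-up process is a comparison against a daily schedule. The schedule format, the `last_checkup` bookkeeping, and the function name below are assumptions for this sketch; the disclosure does not state how the determination is made.

```python
from datetime import datetime, time


def should_begin_checkup(current_time, scheduled_times, last_checkup=None):
    """Return True if a scheduled slot has passed today and no check-up
    has been performed since that slot."""
    for slot in scheduled_times:
        slot_dt = datetime.combine(current_time.date(), slot)
        if slot_dt <= current_time and (
            last_checkup is None or last_checkup < slot_dt
        ):
            return True
    return False
```

For example, with slots at 9:00 and 21:00, a call at 9:05 would begin the check-up unless one had already been recorded after 9:00 that day.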

According to an aspect, the screening machine learning model may be a pupillometry screening model, and wherein the potential screening issue indicates potential dilated pupils.

According to an aspect, the screening machine learning model may be a facial paralysis screening model, and wherein the potential screening issue indicates potential facial paralysis.

According to an aspect, the system may further comprise a wearable device configured to process sensor signals to determine a physiological value for the person, wherein the hardware processor may be configured to execute further instructions to: receive, from the wearable device, the physiological value; and generate the alert comprising the physiological value.

According to an aspect, the wearable device may comprise a pulse oximetry sensor and the physiological value is for blood oxygen saturation.

According to an aspect, the wearable device may be further configured to process the sensor signals to measure at least one of blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, or pleth variability index.

According to an aspect, the hardware processor may be configured to execute further instructions to: receive, from a second computing device, first video data; cause presentation, on the display, of the first video data; receive, from the camera, second video data; and transmit, to the second computing device, the second video data.

According to an aspect, a method is disclosed comprising: receiving a current time; determining to begin a check-up process from the current time; and in response to determining to begin the check-up process, causing presentation, on a display, of a prompt to cause a person to perform a check-up activity, receiving, from a camera, image data of a recording of the check-up activity, invoking a screening machine learning model based on the image data, wherein the screening machine learning model outputs a model result, detecting a potential screening issue based on the model result, and in response to detecting the potential screening issue, providing an alert.

According to an aspect, the screening machine learning model may be a pupillometry screening model, and wherein the potential screening issue indicates potential dilated pupils, the method may further comprise: collecting a first set of images of dilated pupils; collecting a second set of images without dilated pupils; creating a training data set comprising the first set of images and the second set of images; and training the pupillometry screening model using the training data set.

According to an aspect, the screening machine learning model may be a facial paralysis screening model, and wherein the potential screening issue indicates potential facial paralysis, the method may further comprise: collecting a first set of images of facial paralysis; collecting a second set of images without facial paralysis; creating a training data set comprising the first set of images and the second set of images; and training the facial paralysis screening model using the training data set.

According to an aspect, the check-up activity may comprise a dementia test, and wherein the screening machine learning model may comprise a gesture detection model.

According to an aspect, the gesture detection model may be configured to detect a gesture directed towards a portion of the display.

According to an aspect, the method may further comprise: receiving, from the camera, second image data; invoking a person detection model based on the second image data, wherein the person detection model outputs a first classification result; detecting a person based on the first classification result; receiving, from the camera, third image data; and in response to detecting the person, invoking a handwashing detection model based on the third image data, wherein the handwashing detection model outputs a second classification result, detecting a potential lack of handwashing based on the second classification result, and in response to detecting the potential lack of handwashing, providing a second alert.

According to an aspect, a system is disclosed comprising: a storage device configured to store instructions; a camera; and a hardware processor configured to execute the instructions to: receive, from the camera, first image data; invoke an infant detection model based on the first image data, wherein the infant detection model outputs a classification result; detect an infant based on the classification result; receive captured data; and in response to detecting the infant, invoke an infant safety model based on the captured data, wherein the infant safety model outputs a model result, detect a potential safety issue based on the model result, and in response to detecting the potential safety issue, provide an alert.

According to an aspect, the infant safety model may be an infant position model, and wherein the potential safety issue indicates the infant potentially lying on their stomach.

According to an aspect, the hardware processor may be configured to execute further instructions to: receive, from the camera, second image data; and in response to detecting the infant, invoke a facial feature extraction model based on the second image data, wherein the facial feature extraction model outputs a facial feature vector, execute a query of a facial features database based on the facial feature vector, wherein executing the query indicates that the facial feature vector is not present in the facial features database, and in response to determining that the facial feature vector is not present in the facial features database, provide an unrecognized person alert.

According to an aspect, the infant safety model may be an infant color detection model, and wherein the potential safety issue indicates potential asphyxiation.

According to an aspect, the model result may comprise coordinates of a boundary region identifying an infant object in the captured data, and wherein detecting the potential safety issue may comprise: determining that the coordinates of the boundary region exceed a threshold distance from an infant zone.
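A minimal sketch of the distance determination follows, assuming that both the boundary region and the infant zone are axis-aligned rectangles given as `(x_min, y_min, x_max, y_max)` in pixel coordinates and that distance is measured from the center of the boundary region. These representations, the function names, and the 50-pixel threshold are illustrative assumptions and do not come from the disclosure.

```python
import math


def distance_to_zone(box, zone):
    """Distance from the center of `box` to the nearest point of `zone`.

    Returns 0.0 when the center lies inside the zone.
    """
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    dx = max(zone[0] - cx, 0, cx - zone[2])
    dy = max(zone[1] - cy, 0, cy - zone[3])
    return math.hypot(dx, dy)


def outside_infant_zone(box, zone, threshold=50.0):
    """True when the detected infant is farther from the zone than allowed."""
    return distance_to_zone(box, zone) > threshold
```

With a zone of `(100, 100, 300, 300)`, a boundary region centered inside the zone yields a distance of zero, while one centered at `(450, 200)` is 150 pixels away and would indicate a potential safety issue under these assumptions.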

According to an aspect, the system may further comprise a wearable device configured to process sensor signals to determine a physiological value for the infant, wherein the hardware processor may be configured to execute further instructions to: receive, from the wearable device, the physiological value; and generate the alert comprising the physiological value.

According to an aspect, the system may further comprise a microphone, wherein the captured data is received from the microphone, wherein the infant safety model is a loud noise detection model, and wherein the potential safety issue indicates a potential scream.

In various aspects, systems and/or computer systems are disclosed that comprise a computer readable storage medium having program instructions embodied therewith, and one or more processors configured to execute the program instructions to cause the one or more processors to perform operations comprising one or more of the above- and/or below-described aspects (including one or more aspects of the appended claims).

In various aspects, computer-implemented methods are disclosed in which, by one or more processors executing program instructions, one or more of the above- and/or below-described aspects (including one or more aspects of the appended claims) are implemented and/or performed.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages are described below with reference to the drawings, which are intended for illustrative purposes and should in no way be interpreted as limiting. Furthermore, the various features described herein can be combined to form new combinations, which are part of this disclosure. In the drawings, like reference characters can denote corresponding features. The following is a brief description of each of the drawings.

FIG. 1A is a drawing of a camera system in a clinical setting.

FIG. 1B is a schematic diagram illustrating a monitoring system.

FIG. 2 is a schematic drawing of a monitoring system in a clinical setting.

FIG. 3 is another schematic drawing of a monitoring system in a clinical setting.

FIG. 4 is a drawing of patient sensor devices that can be used in a monitoring system.

FIG. 5 illustrates a camera image with object tracking.

FIG. 6 is a drawing of a monitoring system in a home setting.

FIG. 7 is a drawing of a monitoring system configured for baby monitoring.

FIG. 8 is a flowchart of a method for efficiently applying machine learning models.

FIG. 9 is a flowchart of another method for efficiently applying machine learning models.

FIG. 10 is a flowchart of a method for efficiently applying machine learning models for infant care.

FIG. 11 illustrates a block diagram of a computing device that may implement one or more aspects of the present disclosure.

DETAILED DESCRIPTION

As described above, some camera systems are capable of extracting information from captured images. However, extracting information from images and/or monitoring by existing camera systems can be limited. Technical improvements regarding monitoring people and/or objects and automated actions based on the monitoring can advantageously be helpful, improve safety, and possibly save lives.

Generally described, aspects of the present disclosure are directed to improved monitoring systems. In some aspects, a camera system can include a camera and a hardware accelerator. The camera system can include multiple machine learning models. Each model of the machine learning models can be configured to detect an object and/or an activity. The hardware accelerator can be special hardware that is configured to accelerate machine learning applications. The camera system can be configured to execute the machine learning models on the hardware accelerator. The camera system can advantageously be configured to execute conditional logic to determine which machine learning models should be applied and when. For example, until a person is detected in an area, the camera system may not apply any machine learning models related to persons, such as, but not limited to, fall detection, person identification, stroke detection, medication tracking, activity tracking, etc.
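The conditional logic described above can be sketched as a simple gate: person-related models run only after a person has been detected. In this sketch the models are plain Python callables standing in for models that would execute on the hardware accelerator, and the function and parameter names are hypothetical.

```python
def run_pipeline(first_image, second_image, person_detector, person_models, alert):
    """Apply person-related models only if a person was detected first.

    person_detector: callable taking image data, returning True or False.
    person_models: dict mapping a model name to a callable that returns
        True when the model detects a potential issue.
    alert: callable invoked with the name of each model that fires.
    """
    if not person_detector(first_image):
        # No person in the area, so skip every person-related model.
        return []
    triggered = []
    for name, model in person_models.items():
        if model(second_image):
            alert(name)
            triggered.append(name)
    return triggered
```

Skipping the person-related models when no person is present avoids spending accelerator time on inferences that cannot produce a meaningful result.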

Some existing monitoring systems can have limited artificial intelligence capabilities. For example, some existing monitoring systems may only have basic person, object, or vehicle detection. Moreover, some existing monitoring systems may require a network connection from local cameras to backend servers that perform the artificial intelligence processing. Some existing cameras may have limited or no artificial intelligence capabilities. Performing artificial intelligence processing locally on cameras can be technically challenging. For example, the hardware processors and/or memory devices in existing cameras may be so limited that they are unable to execute machine learning models locally. Moreover, existing cameras may have limited software to be able to execute machine learning models locally in an efficient manner. The systems and methods described herein may efficiently process camera data locally and/or in a distributed manner with machine learning models. Accordingly, the systems and methods described herein may improve over existing artificial intelligence monitoring technology.

As used herein, “camera” and “camera system” can be used interchangeably. Moreover, as used herein, “camera” and “camera system” can be used interchangeably with “monitoring system” since a camera system can encompass a monitoring system in some aspects.

FIG. 1A depicts a camera system 114 in a clinical setting 101. The clinical setting 101 can be, but is not limited to, a hospital, nursing home, or hospice. The clinical setting 101 can include the camera system 114, a display 104, and a user computing device 108. In some aspects, the camera system 114 can be housed in a soundbar enclosure or a tabletop speaker enclosure (not illustrated). The camera system 114 can include multiple cameras (such as a 1080p or 4K camera and/or an infrared image camera), an output speaker, an input microphone (such as a microphone array), an infrared blaster, and/or multiple hardware processors (including one or more hardware accelerators). In some aspects, the camera system 114 can have optical zoom. In some aspects, the camera system 114 can include a privacy switch that allows the cameras of the monitoring system 100A, 100B to be closed. The camera system 114 may receive voice commands. The camera system 114 can include one or more hardware components for Bluetooth®, Bluetooth Low Energy (BLE), Ethernet, Wi-Fi, cellular (such as 4G/5G/LTE), near-field communication (NFC), radio-frequency identification (RFID), High-Definition Multimedia Interface (HDMI), and/or HDMI Consumer Electronics Control (CEC). The camera system 114 can be connected to the display 104 (such as a television) and the camera system 114 can control the display 104. In some aspects, the camera system 114 can be wirelessly connected to the user computing device 108 (such as a tablet). In particular, the camera system 114 can be wirelessly connected to a hub device and the hub device can be wirelessly connected to the user computing device 108.

The camera system 114 may include machine learning capabilities. The camera system 114 can include machine learning models. The machine learning models can include, but are not limited to, convolutional neural network (CNN) models and other models. A CNN model can be trained to extract features from images for object identification (such as person identification). In some aspects, a CNN can feed the extracted features to a recurrent neural network (RNN) for further processing. The camera system 114 may track movements of individuals inside the room without using any facial recognition or identification tag tracking. Identification tags can include, but are not limited to, badges and/or RFID tags. This feature allows the camera system 114 to track an individual's movements even when the identification of the individual is unknown. A person in the room may not be identifiable for various reasons. For example, the person may be wearing a mask so that facial recognition modules may not be able to extract any features. As another example, the person may be a visitor who is not issued an identification tag, unlike the clinicians, who typically wear identification tags. Alternatively, when the person is not wearing a mask and/or is wearing an identification tag, the camera system 114 may combine the motion tracking with the identification of the individual to further improve accuracy in tracking the activity of the individual in the room. Having the identity of at least one person in the room may also improve accuracy in tracking the activity of other individuals in the room whose identity is unknown by reducing the number of anonymous individuals in the room. Additional details regarding machine learning capabilities and models that the camera system 114 can use are provided herein.

The camera system 114 can be included in a monitoring system, as described herein. The monitoring system can include remote interaction capabilities. A patient in the clinical setting 101 can be in isolation due to an illness, such as COVID-19. The patient can ask for assistance via a button (such as by selecting an element in the graphical user interface on the user computing device 108) and/or by issuing a voice command. In some aspects, the camera system 114 can be configured to respond to voice commands, such as, but not limited to, activating or deactivating cameras or other functions. In response to the request, a remote clinician 106 can interact with the patient via the display 104 and the camera system 114, which can include an input microphone and an output speaker. The monitoring system can also allow the patient to remotely maintain contact with friends and family via the display 104 and camera system 114. In some aspects, the camera system 114 can be connected to Internet of Things (IoT) devices. In some aspects, closing of the privacy switch can cause the camera system 114 and/or a monitoring system to disable monitoring. In other aspects, the monitoring system can still issue alerts if the privacy switch has been closed. In some aspects, the camera system 114 can record activity via cameras based on a trigger, such as, but not limited to, detection of motion via a motion sensor.

FIG. 1B is a diagram depicting a monitoring system 100A, 100B. In some aspects, there can be a home/assisted living side to the monitoring system 100A and a clinical side to the monitoring system 100B. As described herein, the clinical side monitoring system 100B can track and monitor a patient via a first camera system 114 in a clinical setting. As described herein, the patient can be monitored via wearable sensor devices. A clinician 110 can interact with the patient via the first display 104 and the first camera system 114. Friends and family can also use a user computing device 102 to interact with the patient via the first display 104 and the first camera system 114.

The home/assisted living side monitoring system 100A can track and monitor a person (which can be an infant) via a second camera system 134 in a home/assisted living setting. For example, a person can be recovering at home or live in an assisted living home. As described herein, the person can be monitored via wearable sensor devices. A clinician 110 can interact with the person via the second display 124 and the second camera system 134. As shown, the clinical side to the monitoring system 100B can securely communicate with the home/assisted living side to the monitoring system 100A, which can allow communications between the clinician 110 and persons in the home or assisted living home. Friends and family can use the user computing device 102 to interact with the patient via the second display 124 and the second camera system 134.

In some aspects, the monitoring system 100A, 100B can include server(s) 130A, 130B. The server(s) 130A, 130B can facilitate communication between the clinician 110 and a person via the second display 124 and the second camera system 134. The server(s) 130A, 130B can facilitate communication between the user computing device 102 and the patient via the first display 104 and the first camera system 114. As described herein, the server(s) 130A, 130B can communicate with the camera system(s) 114, 134. In some aspects, the server(s) 130A, 130B can transmit machine learning model(s) to the camera system(s) 114, 134. In some aspects, the server(s) 130A, 130B can train machine learning models based on training data sets.

In some aspects, the monitoring system 100A, 100B can present modified images (which can be in a video format) to clinician(s) or other monitoring users. For example, instead of showing actual persons, the monitoring system 100A, 100B can present images where a person has been replaced with a virtual representation (such as a stick figure) and/or a redacted area such as a rectangle.
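By way of a non-limiting illustration, the redacted-area variant can be sketched as follows. The sketch assumes an image held as nested lists of RGB tuples and a rectangle given as (x_min, y_min, x_max, y_max) with exclusive maxima; the function name `redact_region` and the fill color are hypothetical and do not come from the disclosure.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def redact_region(image: Image, box: Tuple[int, int, int, int],
                  fill: Pixel = (0, 0, 0)) -> Image:
    """Return a copy of `image` with the rectangle `box` filled with `fill`.

    `box` is (x_min, y_min, x_max, y_max); the max values are exclusive.
    This layout is an assumption made for illustration only.
    """
    x_min, y_min, x_max, y_max = box
    height = len(image)
    width = len(image[0]) if height else 0
    # Clamp the rectangle to the image so out-of-range boxes are harmless.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    # Copy each row so the original image is left untouched.
    redacted = [list(row) for row in image]
    for y in range(y_min, y_max):
        for x in range(x_min, x_max):
            redacted[y][x] = fill
    return redacted
```

In such a sketch, the rectangle could be the boundary region returned by an object detection model, so that only the redacted copy is presented to monitoring users.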

FIG. 2 is a diagram depicting a monitoring system 200 in another clinical setting with an accompanying legend. The monitoring system 200 can include, but is not limited to, cameras 272A, 272B, 280A, 280B, 286, 290, 294, displays 292A, 292B, 292C, and a server 276. Some of the cameras 272A, 272B, 280A, 280B, 286, 290, 294 can be the same as or similar to the camera system 114 of FIG. 1A. The cameras 272A, 272B, 280A, 280B, 286, 290, 294 can send data and/or images to the server 276. The server 276 can be located in the hospital room, or elsewhere in the hospital, or at a remote location outside the hospital (not illustrated). As shown, in a clinical setting, such as a hospital, hospitalized patients can be lying on hospital beds, such as the hospital bed 274. The bed cameras 272A, 272B can be near a head side of the bed 274 facing toward a foot side of the bed 274. The clinical setting may have a handwashing area 278. The handwashing cameras 280A, 280B can face the handwashing area 278. The handwashing cameras 280A, 280B can have a combined field of view 282C so as to maximize the ability to detect a person's face and/or identification tag when the person is standing next to the handwashing area 278 facing the sink. Via the bed camera(s) 272A, 272B, the monitoring system 200 can detect whether the clinician (or a visitor) is within a patient zone 275, which can be located within a field of view 282A, 282B of the bed camera(s) 272A, 272B. Patient zones can be customized. For example, the patient zone 275 can be defined as a proximity threshold around the hospital bed 274 and/or a patient. In some aspects, the clinician 281 is within the patient zone 275 if the clinician is at least partially within a proximity threshold distance to the hospital bed and/or the patient.
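As a minimal sketch of the proximity-threshold test described above, a patient zone can be modeled as the bed's footprint expanded by a margin, with a person counted as inside the zone if the person's boundary region at least partially overlaps it. The rectangle model, the floor-plane coordinates, and the names `Box`, `expand`, `overlaps`, and `in_patient_zone` are assumptions for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in floor-plane coordinates (units assumed)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def expand(box: Box, margin: float) -> Box:
    """Grow a rectangle by `margin` on every side."""
    return Box(box.x_min - margin, box.y_min - margin,
               box.x_max + margin, box.y_max + margin)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two rectangles share any interior area."""
    return (a.x_min < b.x_max and b.x_min < a.x_max
            and a.y_min < b.y_max and b.y_min < a.y_max)


def in_patient_zone(person: Box, bed: Box, proximity_threshold: float) -> bool:
    """A person is in the zone if at least partially within the threshold."""
    return overlaps(person, expand(bed, proximity_threshold))
```

Because the zone is derived from the bed rectangle and a single margin, customizing a patient zone in this sketch amounts to changing `proximity_threshold`.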

The bed cameras 272A, 272B can be located above a head side of the bed 274, where the patient's head would be when the patient lies on the bed 274. The bed cameras 272A, 272B can be separated by a distance, which can be wider than a width of the bed 274, and can both be pointing toward the bed 274. The fields of view 282A, 282B of the bed cameras 272A, 272B can overlap at least partially over the bed 274. The combined field of view 282A, 282B can cover an area surrounding the bed 274 so that a person standing by any of the four sides of the bed 274 can be in the combined field of view 282A, 282B. The bed cameras 272A, 272B can each be installed at a predetermined height and pointing downward at a predetermined angle. The bed cameras 272A, 272B can be configured so as to maximize the ability to detect the face of a person standing next to or near the bed 274, independent of the orientation of the person's face, and/or the ability to detect an identification tag that is worn on the person's body, for example, hanging by the neck, the belt, etc. Optionally, the bed cameras 272A, 272B need not be able to identify the patient lying on the bed 274, as the identity of the patient is typically known in clinical and other settings.

In some aspects, the cameras 272A, 272B, 280A, 280B, 286, 290, 294 can be configured, including but not limited to being installed at a height and/or angle, to allow the monitoring system 200 to detect a person's face and/or identification tag, if any. For example, at least some of the cameras 272A, 272B, 280A, 280B, 286, 290, 294 can be installed at a ceiling of the room or at a predetermined height above the floor of the room. The cameras 272A, 272B, 280A, 280B, 286, 290, 294 can be configured to detect an identification tag. Additionally or alternatively, the cameras 272A, 272B, 280A, 280B, 286, 290, 294 can be configured to detect faces, which can include extracting facial recognition features of the detected face, and/or to detect a face and the identification tag substantially simultaneously.

In some aspects, the monitoring system 200 can monitor one or more aspects about the patient, the clinician 281, and/or zones. The monitoring system 200 can determine whether the patient is in the bed 274. The monitoring system 200 can detect whether the patient is within a bed zone, which can be within the patient zone 275. The monitoring system 200 can determine an angle of the patient in the bed 274. In some aspects, the monitoring system 200 can include a wearable, wireless sensor device (not illustrated) that can track a patient's posture, orientation, and activity. In some aspects, a wearable, wireless sensor device can include, but is not limited to, a Centroid® device by Masimo Corporation, Irvine, Calif. The monitoring system 200 can determine how often the patient has turned in the bed 274 and/or gotten up from the bed 274. The monitoring system 200 can detect turning and/or getting up based on the bed zone and/or facial recognition of the patient. The monitoring system 200 can detect whether the clinician 281 is within the patient zone 275 or another zone. As described herein, the monitoring system 200 can detect whether the clinician 281 is present or not present via one or more methods, such as, but not limited to, facial recognition, identification via an image of an identification tag, and/or RFID based tracking. Similarly, the monitoring system 200 can detect intruders that are unauthorized in one or more zones via one or more methods, such as, but not limited to, facial recognition, identification via an image of an identification tag, and/or RFID based tracking. In some aspects, the monitoring system 200 can issue an alert based on one or more of the following factors: facial detection of an unrecognized face; no positive visual identification of authorized persons via identification tags; and/or no positive identification of authorized persons via RFID tags. In some aspects, the monitoring system 200 can detect falls via one or more methods, such as, but not limited to, machine-vision based fall detection and/or fall detection via a wearable device, such as using accelerometer data. Any of the alerts described herein can be presented on the displays 292A, 292B, 292C.

In some aspects, if the monitoring system 200 detects that the clinician 281 is within the patient zone 275 and/or has touched the patient, then the system 200 can assign a “contaminated” status to the clinician 281. The monitoring system 200 can detect a touch action by detecting the actual act of touching by the clinician 281 and/or by detecting the clinician 281 being in close proximity, for example, within less than 1 foot, 6 inches, or otherwise, of the patient. If the clinician 281 moves outside the patient zone 275, then the monitoring system 200 can assign a “contaminated-prime” status to the clinician 281. If the clinician 281 with the “contaminated-prime” status re-enters the same patient zone 275 or enters a new patient zone, the monitoring system 200 can output an alarm or warning. If the monitoring system 200 detects a handwashing activity by the clinician 281 with a “contaminated-prime” status, then the monitoring system 200 can assign a “not contaminated” status to the clinician 281.
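The status transitions described above can be sketched as a small state machine. The sketch below is illustrative only: the names `Status`, `Event`, and `step` are hypothetical, and two behaviors are assumptions not stated in the disclosure, namely that a clinician who triggers an alarm by entering a patient zone is then treated as “contaminated” again, and that events with no described transition leave the status unchanged.

```python
from enum import Enum
from typing import Tuple


class Status(Enum):
    NOT_CONTAMINATED = "not contaminated"
    CONTAMINATED = "contaminated"
    CONTAMINATED_PRIME = "contaminated-prime"


class Event(Enum):
    ENTER_PATIENT_ZONE = "enter patient zone"  # or a detected touch action
    EXIT_PATIENT_ZONE = "exit patient zone"
    HANDWASH = "handwash"


def step(status: Status, event: Event) -> Tuple[Status, bool]:
    """Return (new_status, alarm) for one observed event."""
    if event is Event.ENTER_PATIENT_ZONE:
        # Entering a zone while "contaminated-prime" raises an alarm.
        alarm = status is Status.CONTAMINATED_PRIME
        # Assumption: after entry the clinician is "contaminated" again.
        return Status.CONTAMINATED, alarm
    if event is Event.EXIT_PATIENT_ZONE and status is Status.CONTAMINATED:
        return Status.CONTAMINATED_PRIME, False
    if event is Event.HANDWASH and status is Status.CONTAMINATED_PRIME:
        return Status.NOT_CONTAMINATED, False
    # Assumption: any other combination leaves the status unchanged.
    return status, False
```

For example, entering a zone, leaving it, and entering another zone without handwashing yields an alarm on the second entry, whereas a handwashing event between the two entries clears the status and no alarm is raised.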

A person may also be contaminated by entering contaminated areas other than a patient zone. For example, as shown in FIG. 2, the contaminated areas can include a patient consultation area 284. The patient consultation area 284 can be considered a contaminated area with or without the presence of a patient. The monitoring system 200 can include a consultation area camera 286, which has a field of view 282D that overlaps with and covers the patient consultation area 284. The contaminated areas can further include a check-in area 288 that is next to a door of the hospital room. Alternatively and/or additionally, the check-in area 288 can extend to include the door. The check-in area 288 can be considered a contaminated area with or without the presence of a patient. The monitoring system 200 can include an entrance camera 290, which has a field of view 282E that overlaps with and covers the check-in area 288.

As shown in FIG. 2, the monitoring system 200 can include an additional camera 294. Additional cameras may not be directed to any specific contaminated and/or handwashing areas. For example, the additional camera 294 can have a field of view 282F that substantially covers an area that a person likely has to pass when moving from one area to another area of the hospital room, such as from the patient zone 275 to the consultation area 284. The additional camera 294 can provide data to the server 276 to facilitate tracking of movements of the people in the room.

FIG. 3 depicts a monitoring system 300 in another clinical setting. The monitoring system 300 may monitor the activities of anyone present in the room such as medical personnel, visitors, patients, custodians, etc. As described herein, the monitoring system 300 may be located in a clinical setting such as a hospital room. The hospital room may include one or more patient beds 308. The hospital room may include an entrance/exit 329 to the room. The entrance/exit 329 may be the only entrance/exit to the room.

The monitoring system 300 can include a server 322, a display 316, one or more camera systems 314, 318, 320, and an additional device 310. The camera systems 314, 318, 320 may be connected to the server 322. The server 322 may be a remote server. The one or more camera systems may include a first camera system 318, a second camera system 320, and/or additional camera systems 314. The camera systems 314, 318, 320 may include one or more processors, which can include one or more hardware accelerators. The processors can be enclosed in an enclosure 313, 324, 326 of the camera systems 314, 318, 320. In some aspects, the processors can include, but are not limited to, an embedded processing unit, such as an Nvidia® Jetson Xavier™ NX/AGX, that is embedded in an enclosure of the camera systems 314, 318, 320. The one or more processors may be physically located outside of the room. The processors may include microcontrollers such as, but not limited to, ASICs, FPGAs, etc. The camera systems 314, 318, 320 may each include a camera. The camera(s) may be in communication with the one or more processors and may transmit image data to the processor(s). In some aspects, the camera systems 314, 318, 320 can exchange data and state information with other camera systems.

The monitoring system 300 may include a database. The database can include information relating to the location of items in the room such as camera systems, patient beds, handwashing stations, and/or entrance/exits. The database can include locations of the camera systems 314, 318, 320 and the items in the field of view of each camera system 314, 318, 320. The database can further include settings for each of the camera systems. Each camera system 314, 318, 320 can be associated with an identifier, which can be stored in the database. The server 322 may use the identifiers to configure each of the camera systems 314, 318, 320.

As shown in FIG. 3, the first camera system 318 can include a first enclosure 324 and a first camera 302. The first enclosure 324 can enclose one or more hardware processors. The first camera 302 may be a camera capable of sensing depth and color, such as, but not limited to, an RGB-D stereo depth camera. The first camera 302 may be positioned in a location of the room to monitor the entire room or substantially all of the room. The first camera 302 may be tilted downward at a higher location in the room. The first camera 302 may be set up to minimize blind spots in the field of view of the first camera 302. For example, the first camera 302 may be located in a corner of the room. The first camera 302 may be facing the entrance/exit 329 and may have a view of the entrance/exit 329 of the room.

As shown in FIG. 3, the second camera system 320 can include a second enclosure 326 (which can include one or more processors) and a second camera 304. The second camera 304 may be an RGB color camera. Alternatively, the second camera 304 may be an RGB-D stereo depth camera. The second camera 304 may be installed over a hand hygiene compliance area 306. The hand hygiene compliance area 306 may include a sink and/or a hand sanitizer dispenser. The second camera 304 may be located above the hand hygiene compliance area 306 and may point downward toward the hand hygiene compliance area 306. For example, the second camera 304 may be located on or close to the ceiling and may have a view of the hand hygiene compliance area 306 from above.

In a room of a relatively small size, the first and second camera systems 318, 320 may be sufficient for monitoring the room. Optionally, for example, if the room is of a relatively larger size, the system 300 may include any number of additional camera systems, such as a third camera system 314. The third camera system 314 may include a third enclosure 313 (which can include one or more processors) and a third camera 312. The third camera 312 of the third camera system 314 may be located near the patient's bed 308 or in a corner of the room, for example, a corner of the room that is different than (for example, opposite or diagonal to) the corner of the room where the first camera 302 of the first camera system 318 is located. The third camera 312 may be located at any other suitable location of the room to aid in reducing blind spots in the combined fields of view of the first camera 302 and the second camera 304. The third camera 312 of the third camera system 314 may have a field of view covering the entire room. The third camera system 314 may operate similarly to the first camera system 318, as described herein.

The monitoring system 300 may include one or more additional devices 310. The additional device 310 can be, but is not limited to, a patient monitoring and connectivity hub, bedside monitor, or other patient monitoring device. For example, the additional device 310 can be a Root® monitor by Masimo Corporation, Irvine, Calif. Additionally or alternatively, the additional device 310 can be, but is not limited to, a display device of a data aggregation and/or alarm visualization platform. For example, the additional device 310 can be a display device (not illustrated) for the Uniview® platform by Masimo Corporation, Irvine, Calif. The additional device(s) 310 can include smartphones or tablets (not illustrated). The additional device(s) may be in communication with the server 322 and/or the camera systems 318, 320, 314.

The monitoring system 300 can output alerts on the additional device(s) 310 and/or the display 316. The outputted alert may be any auditory and/or visual signal. Outputted alerts can include, but are not limited to, a fall alert, an unauthorized person alert, an alert that a patient should be turned, or an alert that a person has not complied with the hand hygiene protocol. For example, someone outside of the room can be notified on an additional device 310 and/or the display 316 that an emergency has occurred in the room. In some aspects, the monitoring system 300 can provide a graphical user interface, which can be presented on the display 316. A configuration user can configure the monitoring system 300 via the graphical user interface presented on the display 316.

FIG. 4 depicts patient sensor devices 404, 406, 408 (such as a wearable device) and a user computing device 402 (which may not be drawn to scale) that can be used in a monitoring system. In some aspects, one or more of the patient sensor devices 404, 406, 408 can be optionally used in a monitoring system. Additionally or alternatively, patient sensor devices can be used with the monitoring system that are different than the devices 404, 406, 408 depicted in FIG. 4. A patient sensor device can non-invasively measure physiological parameters from a fingertip, wrist, chest, forehead, or other portion of the body. The first, second, and third patient sensor devices 404, 406, 408 can be wirelessly connected to the user computing device 402 and/or a server in the monitoring system. The first patient sensor device 404 can include a display and a touchpad and/or touchscreen. The first patient sensor device 404 can be a pulse oximeter that is designed to non-invasively monitor patient physiological parameters from a fingertip. The first patient sensor device 404 can measure physiological parameters such as, but not limited to, blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, and/or pleth variability index. The first patient sensor device 404 can be a MightySat® fingertip pulse oximeter by Masimo Corporation, Irvine, Calif. The second patient sensor device 406 can be configured to be worn on a patient's wrist to non-invasively monitor patient physiological parameters from a wrist. The second patient sensor device 406 can be a smartwatch. The second patient sensor device 406 can include a display and/or touchscreen. The second patient sensor device 406 can measure physiological parameters including, but not limited to, blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, and/or pleth variability index. The third patient sensor device 408 can be a temperature sensor that is designed to non-invasively monitor physiological parameters of a patient. In particular, the third patient sensor device 408 can measure a temperature of the patient. The third patient sensor device 408 can be a Radius T°™ sensor by Masimo Corporation, Irvine, Calif. A patient, clinician, or other authorized user can use the user computing device 402 to view physiological information and other information from the monitoring system.

As shown, a graphical user interface can be presented on the user computing device 402. The graphical user interface can present physiological parameters that have been measured by the patient sensor devices 404, 406, 408. As described herein, the graphical user interface can also present alerts and information from the monitoring system. The graphical user interface can present alerts such as, but not limited to, a fall alert, an unauthorized person alert, an alert that a patient should be turned, or an alert that a person has not complied with the hand hygiene protocol.

FIG. 5 illustrates a camera image 500 with object tracking. The monitoring system can track the persons 502A, 502B, 502C in the camera image 500 with the boundary regions 504, 506, 508. In some aspects, each camera system in a monitoring system can be configured to perform object detection. As described herein, some monitoring systems can have a single camera system while other monitoring systems can have multiple camera systems. Each camera system can be configured with multiple machine learning models for object detection. A camera system can receive image data from a camera. The camera can capture a sequence of images (which can be referred to as frames). The camera system can process the frame with a YOLO (You Only Look Once) deep learning network, which can be trained to detect objects (such as persons 502A, 502B, 502C) and return coordinates of the boundary regions 504, 506, 508. In some aspects, the camera system can process the frame with an inception CNN, which can be trained to detect activities, such as hand sanitizing or hand washing (not illustrated). The machine learning models, such as the inception CNN, can be trained using a dataset of a particular activity type, such as handwashing or hand sanitizing demonstration videos, for example.

The camera system can determine processed data that consists of the boundary regions 504, 506, 508 surrounding a detected person 502A, 502B, 502C in the room, such as coordinates of the boundary regions. The camera system can provide the boundary regions to a server in the monitoring system. In some aspects, processed data may not include the images captured by the camera. Advantageously, the images from the camera can be processed locally at the camera system and may not be transmitted outside of the camera system. In some aspects, the monitoring system can ensure anonymity and protect privacy of imaged persons by not transmitting the images outside of each camera system.
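As a minimal sketch of processed data that carries only boundary-region coordinates and no image content, a camera system could serialize a record such as the following before providing it to a server. The field names, the JSON encoding, and the names `BoundaryRegion`, `ProcessedFrame`, and `to_payload` are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class BoundaryRegion:
    """Pixel coordinates of one boundary region (layout assumed)."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int


@dataclass
class ProcessedFrame:
    """Processed data for one frame; note there is no pixel data field."""
    camera_id: str
    frame_index: int
    regions: List[BoundaryRegion]


def to_payload(frame: ProcessedFrame) -> str:
    """Serialize the coordinates-only record for transmission to a server."""
    return json.dumps(asdict(frame), sort_keys=True)
```

Because the record type has no field for pixels, the captured image cannot leave the camera system through this payload in the sketch.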

The camera system can track objects using the boundary regions. The camera system can compare the intersection of boundary regions in consecutive frames. A sequence of boundary regions associated with an object through consecutive frames can be referred to as a “track.” The camera system may associate boundary regions if the boundary regions of consecutive frames overlap by a threshold amount or are within a threshold distance of one another. The camera system may determine that boundary regions from consecutive frames that are adjacent (or the closest to each other) are associated with the same object. Thus, whenever object detection occurs in the field of view of one camera, that object may be associated with the nearest track.
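The association of boundary regions into tracks can be sketched as follows. The sketch assumes intersection-over-union (IoU) as the overlap measure and a greedy, one-detection-per-track assignment; the disclosure does not specify either choice, and the names `iou`, `update_tracks`, and `min_iou` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boundary regions."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def update_tracks(tracks: Dict[int, List[Box]], detections: List[Box],
                  min_iou: float = 0.3) -> Dict[int, List[Box]]:
    """Append each detection to its best-overlapping track, else start one."""
    next_id = max(tracks, default=-1) + 1
    claimed = set()  # tracks already extended in this frame
    for det in detections:
        best_id, best_score = None, min_iou
        for track_id, boxes in tracks.items():
            if track_id in claimed:
                continue
            score = iou(boxes[-1], det)  # compare with the previous frame
            if score >= best_score:
                best_id, best_score = track_id, score
        if best_id is None:
            best_id = next_id  # no sufficient overlap: begin a new track
            next_id += 1
            tracks[best_id] = []
        tracks[best_id].append(det)
        claimed.add(best_id)
    return tracks
```

Calling `update_tracks` once per frame with that frame's detected boundary regions extends existing tracks where overlap is sufficient and opens a new track otherwise.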

As described herein, the camera system can use one or more computer vision algorithms. For example, a computer vision algorithm can identify a boundary region around a person's face or around a person's body. In some aspects, the camera system can detect faces using a machine learning model, such as, but not limited to, Google's FaceNet. The machine learning model can receive an image of the person's face as input and output a vector of numbers, which can represent features of a face. In some aspects, the camera system can send the extracted facial features to the server. The monitoring system can map the extracted facial features to a person. The numbers in the vector can represent facial features corresponding to points on one's face. Facial features of known people (such as clinicians or staff) can be stored in a facial features database, which can be part of the database described herein. To identify an unknown individual, such as a new patient or a visitor, the monitoring system can initially mark the unknown person as unknown and subsequently identify the same person in multiple camera images. The monitoring system can populate a database with the facial features of the new person.

FIG. 6 depicts a monitoring system 600 in a home setting. The monitoring system 600 can include, but is not limited to, one or more cameras 602, 604, 606. Some of the cameras, such as a first camera 602 of the monitoring system 600, can be the same as or similar to the camera system 114 of FIG. 1A. In some aspects, the cameras 602, 604, 606 can send data and/or images to a server (not illustrated). The monitoring system 600 can be configured to detect a pet 610 using the object identification techniques described herein. The monitoring system 600 can be further configured to determine if a pet 610 was fed or if the pet 610 is chewing or otherwise damaging the furniture 612. In some aspects, the monitoring system 600 can be configured to communicate with a home automation system. For example, if the monitoring system 600 detects that the pet 610 is near a door, the monitoring system 600 can instruct the home automation system to open the door. In some aspects, the monitoring system 600 can provide alerts and/or commands in the home setting to deter a pet from some activity (such as biting a couch, for example).

FIG. 7 depicts a monitoring system 700 in an infant care setting. The monitoring system 700 can include one or more cameras 702. In some aspects, a camera in the monitoring system 700 can send data and/or images to a server (not illustrated). The monitoring system 700 can be configured to detect an infant 704 using the object identification techniques described herein. Via a camera, the monitoring system 700 can detect whether a person is within an infant zone, which can be located within a field of view of the camera 702. Infant zones can be similar to patient zones, as described herein. For example, an infant zone can be defined as a proximity threshold around a crib 706 and/or the infant 704. In some aspects, a person is within the infant zone if the person is at least partially within a proximity threshold distance to the crib 706 and/or the infant 704. The monitoring system 700 can use object tracking, as described herein, to determine if the infant 704 is moved. For example, the monitoring system 700 can issue an alert if the infant 704 leaves the crib 706. The monitoring system 700 can include one or more machine learning models.

The monitoring system 700 can detect whether an unauthorized person is within the infant zone. The monitoring system 700 can detect whether an unauthorized person is present using one or more methods, such as, but not limited to, facial recognition, identification via an image of an identification tag, and/or RFID based tracking. Identification tag tracking (whether an identification badge, RFID tracking, or some other tracking) can be applicable to hospital-infant settings. In some aspects, the monitoring system 700 can issue an alert based on one or more of the following factors: facial detection of an unrecognized face; no positive visual identification of authorized persons via identification tags; and/or no positive identification of authorized persons via RFID tags.

As described herein, a machine learning model of the monitoring system 700 can receive an image of a person's face as input and output a vector of numbers, which can represent features of a face. The monitoring system 700 can map the extracted facial features to a known person. For example, a database of the monitoring system 700 can store a mapping from facial features (but not actual pictures of faces) to person profiles. If the monitoring system 700 cannot match the features to features from a known person, the monitoring system 700 can mark the person as unknown and issue an alert. Moreover, the monitoring system 700 can issue another alert if the unknown person moves the infant 704 outside of a zone.
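The mapping from extracted facial features to person profiles can be sketched as a nearest-neighbor lookup over stored feature vectors. The sketch assumes Euclidean distance and a fixed distance cutoff; the disclosure does not specify the comparison, and the names `match_person`, `known`, and `max_distance`, as well as the cutoff value, are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_person(features: Sequence[float],
                 known: Dict[str, Sequence[float]],
                 max_distance: float = 0.6) -> Optional[str]:
    """Return the closest known profile identifier, or None if unknown.

    `known` maps a profile identifier to stored facial features; no
    pictures of faces are stored. `max_distance` is an assumed cutoff.
    """
    best_id, best_dist = None, max_distance
    for person_id, reference in known.items():
        dist = euclidean(features, reference)
        if dist <= best_dist:
            best_id, best_dist = person_id, dist
    return best_id
```

A result of `None` corresponds to the case in which the features cannot be matched to a known person, at which point the person could be marked as unknown and an alert issued.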

In some aspects, the monitoring system 700 can monitor movements of the infant 704. The monitoring system 700 can monitor a color of the infant for physiological concerns. For example, the monitoring system can detect a change in color of skin (such as a bluish color) since that might indicate potential asphyxiation. The monitoring system 700 can use trained machine learning models to identify skin color changes. The monitoring system 700 can detect a position of the infant 704. For example, if the infant 704 rolls onto their stomach, the monitoring system 700 can issue a warning since it may be safer for the infant 704 to lie on their back. The monitoring system 700 can use trained machine learning models to identify potentially dangerous positions. In some aspects, a non-invasive sensor device (not illustrated) can be attached to the infant 704 (such as a wristband or a band that wraps around the infant's foot) to monitor physiological parameters of the infant. The monitoring system 700 can receive the physiological parameters, such as, but not limited to, blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, and/or pleth variability index. In some aspects, the monitoring system 700 can include a microphone that can capture audio data. The monitoring system 700 can detect sounds from the infant 704, such as crying. The monitoring system 700 can issue an alert if the detected sounds are above a threshold decibel level. Additionally or alternatively, the monitoring system 700 can process the sounds with a machine learning model. For example, the monitoring system 700 can convert sound data into spectrograms, input them into a CNN and a linear classifier model, and output a prediction whether the sounds (such as excessive crying) should cause a warning to be issued. In some aspects, the monitoring system 700 can include a thermal camera. The monitoring system 700 can use trained machine learning models to identify a potentially wet diaper from an input thermal image.
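The threshold decibel check on captured audio can be sketched as follows. The sketch assumes audio samples normalized to the range -1.0 to 1.0 and a level expressed in decibels relative to full scale (dBFS); the disclosure does not state a reference level, and the names `level_dbfs` and `should_alert` and the default threshold are hypothetical.

```python
import math
from typing import Sequence


def level_dbfs(samples: Sequence[float]) -> float:
    """Root-mean-square level of normalized samples, in dB full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def should_alert(samples: Sequence[float],
                 threshold_dbfs: float = -20.0) -> bool:
    """True if the window is louder than the assumed threshold."""
    return level_dbfs(samples) > threshold_dbfs
```

In practice, a window of samples from the microphone would be passed to `should_alert`, and the threshold would need to be calibrated for the microphone and the room.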

Efficient Machine Learning Model Application

FIG. 8 is a flowchart of a method 800 for efficiently applying machine learning models, according to some aspects of the present disclosure. As described herein, a monitoring system, which can include a camera system, may implement aspects of the method 800. The method 800 may include fewer or additional blocks and/or the blocks may be performed in an order different than is illustrated.

Beginning at block 802, image data can be received. A camera system (such as the camera systems 114, 318 of FIGS. 1A, 3 described herein) can receive image data from a camera. Depending on the type of camera and configuration of the camera, the camera system can receive different types of images, such as 4K, 1080p, or 8MP images. Image data can also include, but is not limited to, a sequence of images. A camera in a camera system can continuously capture images. Therefore, the camera in a camera system can capture images of objects (such as a patient, a clinician, an intruder, the elderly, an infant, a youth, or a pet) in a room either at a clinical facility, a home, or an assisted living home.

At block 806, a person detection model can be applied. The camera system can apply the person detection model based on the image data. In some aspects, the camera system can invoke the person detection model on a hardware accelerator. The hardware accelerator can be configured to accelerate the application of machine learning models, including a person detection model. The person detection model can be configured to receive image data as input. The person detection model can be configured to output a classification result. In some aspects, the classification result can indicate a likelihood (such as a percentage chance) that the image data includes a person. In other aspects, the classification result can be a binary result: either the object is predicted as present in the image or not. The person detection model can be, but is not limited to, a CNN. The person detection model can be trained to detect persons. For example, the person detection model can be trained with a training data set with labeled examples indicating whether the input data includes a person or not.

At block 808, it can be determined whether a person is present. The camera system can determine whether a person is present. The camera system can determine whether a person object is located in the image data. The camera system can receive from the person detection model (which can execute on the hardware accelerator) the output of a classification result. In some aspects, the output can be a binary result, such as, “yes” there is a person object present or “no” there is not a person object present. In other aspects, the output can be a percentage result and the camera system can determine the presence of a person if the percentage result is above a threshold. If a person is detected, the method 800 proceeds to the block 810 to receive second image data. If a person is not detected, the method 800 proceeds to repeat the previous blocks 802, 806, 808 to continue checking for persons.

At block 810, second image data can be received. The block 810 for receiving the second image data can be similar to the previous block for receiving image data. Moreover, the camera in the camera system can continuously capture images, which can lead to the second image data. As described herein, the image data can include multiple images, such as a sequence of images.

At block 812, one or more person safety models can be applied. In response to detecting a person, the camera system can apply one or more person safety models. The camera system can invoke (which can be invoked on a hardware accelerator) a fall detection model based on the second image data. The fall detection model can output a classification result. In some aspects, the fall detection model can be or include a CNN. The camera system can pre-process the image data. In some aspects, the camera system can convert an image into an RGB image, which can be an m-by-n-by-3 data array that defines red, green, and blue color components for each individual pixel in the image. In some aspects, the camera system can compute an optical flow from the image data (such as the RGB images), which can be a two-dimensional vector field between two images. The two-dimensional vector field can show how the pixels of an object in the first image move to form the same object in the second image. The fall detection model can be pre-trained to perform feature extraction and classification of the image data (which can be pre-processed image data) to output a classification result. In some aspects, the fall detection model can be made of various layers, such as, but not limited to, a convolution layer, a max pooling layer, and a regularization layer, together with a classifier, such as, but not limited to, a softmax classifier.
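As an illustration of the final classifier stage only, a softmax function converts the raw scores of the last layer into class likelihoods that sum to one. The sketch below is generic and assumes nothing about the actual fall detection model; the example scores are made up.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Map raw class scores to likelihoods that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the classes ("fall", "no fall").
likelihoods = softmax([2.0, 0.0])
```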

As described herein, in some aspects, an advantage of performing the previous blocks 802, 806, 808 for checking whether a person is present is that more computationally expensive operations, such as applying one or more person safety models, can be delayed until a person is detected. The camera system can invoke (which can be invoked on a hardware accelerator) multiple person safety models based on the second image data. For each person safety model that is invoked, the camera system can receive a model result, such as but not limited to, a classification result. As described herein, the person safety models can include a fall detection model, a handwashing detection model, and/or an intruder detection model.
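The gating described above can be sketched as a single pass through blocks 802 to 812. The function `monitor_step` and its callable parameters are hypothetical stand-ins for the camera, the person detection model, and the person safety models; the point is only that the safety models are never invoked unless a person is detected first.

```python
from typing import Any, Callable, Dict, Optional


def monitor_step(
    image: Any,
    receive_second_image: Callable[[], Any],
    detect_person: Callable[[Any], bool],
    safety_models: Dict[str, Callable[[Any], bool]],
) -> Optional[Dict[str, bool]]:
    """One pass through blocks 802-812; returns None when no person is seen."""
    if not detect_person(image):       # blocks 806, 808: inexpensive gate
        return None                    # caller repeats from block 802
    second = receive_second_image()    # block 810
    # Block 812: the costlier models run only after a person is detected.
    return {name: model(second) for name, model in safety_models.items()}
```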

At block 814, it can be determined whether there is a person safety issue. The camera system can determine whether there is a person safety issue. As described above, for each person safety model that is invoked, the camera system can receive a model result as output. For some models, the output can be a binary result, such as, “yes” a fall has been detected or “no” a fall has not been detected. For other models, the output can be a percentage result and the camera system can determine a person safety issue exists if the percentage result is above a threshold. In some aspects, evaluation of the one or more person safety models can result in an issue detection if at least one model returns a result that indicates issue detection. If a person safety issue is detected, the method 800 proceeds to block 816 to provide an alert and/or take an action. If a person safety issue is not detected, the method 800 proceeds to repeat the previous blocks 802, 806, 808 to continue checking for persons.

At block 816, an alert can be provided and/or an action can be taken. In some aspects, the camera system can initiate an alert. The camera system can notify a monitoring system to provide an alert. In some aspects, a user computing device 102 can receive an alert about a safety issue. In some aspects, a clinician 110 can receive an alert about the safety issue. In some aspects, the camera system can initiate an action. The camera system can cause the monitoring system to take an action. For example, the monitoring system can automatically notify emergency services (such as an emergency hotline and/or an ambulance service) to send someone to help.

FIG. 9 is a flowchart of another method 900 for efficiently applying machine learning models, according to some aspects of the present disclosure. As described herein, a monitoring system, which can include a camera system, may implement aspects of the method 900 as described herein. The method 900 may include fewer or additional blocks and/or the blocks may be performed in an order different than is illustrated. The block(s) of the method 900 of FIG. 9 can be similar to the block(s) of the method 800 of FIG. 8. In some aspects, the block(s) of the method 900 of FIG. 9 can be used in conjunction with the block(s) of the method 800 of FIG. 8.

Beginning at block 902, a training data set can be received. The monitoring system can receive a training data set. In some aspects, a first set of videos of person falls can be collected and a second set of videos of persons without falling can be collected. A training data set can be created with the first set of videos and the second set of videos. Other training data sets can be created. For example, for machine learning of handwashing, a first set of videos with handwashing and a second set of videos without handwashing can be collected; and a training data set can be created from the first set of videos and the second set of videos. For machine learning detection of dilated pupils, a first set of images with dilated pupils and a second set of images without dilated pupils can be collected; and a training data set can be created from the first set of images and the second set of images. For machine learning detection of facial paralysis, a first set of images with facial paralysis and a second set of images without facial paralysis can be collected; and a training data set can be created from the first set of images and the second set of images. For machine learning detection of an infant, a first set of images with an infant and a second set of images without an infant can be collected; and a training data set can be created from the first set of images and the second set of images. For machine learning detection of an infant's position, a first set of images of an infant on their back and a second set of images of an infant on their stomach or their side can be collected; and a training data set can be created from the first set of images and the second set of images. For machine learning detection of an unconscious state, a first set of videos of persons in an unconscious state and a second set of videos of persons in a state of consciousness can be collected; and a training data set can be created from the first set of videos and the second set of videos. For other machine learning detection of an unconscious state, a first set of audio recordings of persons in an unconscious state and a second set of audio recordings of persons in a state of consciousness can be collected; and a training data set can be created from the first set of audio recordings and the second set of audio recordings. The monitoring system can receive training data sets for any of the machine learning models described herein that can be trained with supervised machine learning.
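Each of these training data sets follows the same pattern: two collected sets are combined and each example is paired with its class label. A minimal sketch of that pattern is shown below; the function name, the file names, and the shuffling step are illustrative assumptions rather than details from the disclosure.

```python
import random
from typing import List, Sequence, Tuple


def build_training_set(
    first_set: Sequence[str],
    second_set: Sequence[str],
    first_label: str,
    second_label: str,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Pair each collected example with its class label and shuffle the result."""
    examples = [(item, first_label) for item in first_set]
    examples += [(item, second_label) for item in second_set]
    random.Random(seed).shuffle(examples)  # assumed step: avoid label-ordered data
    return examples


# Hypothetical file names for a fall detection training data set.
fall_data = build_training_set(["fall_01.mp4", "fall_02.mp4"],
                               ["walk_01.mp4"], "fall", "no fall")
```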

At block 904, a machine learning model can be trained. The monitoring system can train one or more machine learning models. The monitoring system can train a fall detection model using the training data set from the previous block 902. The monitoring system can train a handwashing detection model using the training data set from the previous block 902. The monitoring system can train any of the machine learning models described herein that use supervised machine learning.

In some aspects, the monitoring system can train a neural network, such as, but not limited to, a CNN. The monitoring system can initialize the neural network with random weights. During the training of the neural network, the monitoring system feeds labelled data from the training data set to the neural network. Class labels can include, but are not limited to, fall, no fall, hand washing, no hand washing, loud noise, no loud noise, normal pupils, dilated pupils, no facial paralysis, facial paralysis, infant, no infant, supine position, prone position, side position, unconscious, conscious, etc. The neural network can process each input vector, with its weight values initially being assigned randomly, and then make comparisons with the class label of the input vector. If the output prediction does not match the class label, an adjustment to the weights of the neural network neurons is made so that the output correctly matches the class label. The corrections to the value of weights can be made through a technique, such as, but not limited to, backpropagation. Each run of training of the neural network can be called an “epoch.” The neural network can go through several series of epochs during the process of training, which results in further adjusting of the neural network weights. After each epoch step, the neural network can become more accurate at classifying and correctly predicting the class of the training data. After training the neural network, the monitoring system can use a test dataset to verify the neural network's accuracy. The test dataset can be a set of labelled test data that were not included in the training process. Each test vector can be fed to the neural network, and the monitoring system can compare the output to the actual class label of the test input vector.
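The training procedure can be illustrated with a deliberately simplified stand-in: a single logistic neuron rather than a CNN, trained by per-example gradient descent (the one-layer case of backpropagation). The function names, learning rate, epoch count, and the two-feature toy data in the usage note are all assumptions chosen for brevity.

```python
import math
import random
from typing import List, Tuple

Example = Tuple[List[float], int]  # (input vector, class label 0 or 1)


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    """Return the likelihood of class 1 for one input vector."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data: List[Example], epochs: int = 200, lr: float = 0.5,
          seed: int = 0) -> Tuple[List[float], float]:
    """Start from random weights and adjust them over several epochs."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in data[0][0]]
    bias = 0.0
    for _ in range(epochs):                      # one pass over the data per epoch
        for x, label in data:
            error = predict(weights, bias, x) - label   # prediction vs. class label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def accuracy(weights: List[float], bias: float, data: List[Example]) -> float:
    """Fraction of examples whose predicted class matches the label."""
    correct = sum((predict(weights, bias, x) > 0.5) == bool(label)
                  for x, label in data)
    return correct / len(data)
```

In use, `train` would be called on the labelled training data and `accuracy` on a held-out test dataset that was not part of training, mirroring the verification step described above.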

At block 906, input data can be received. The camera system can receive input data. In some aspects, the block 906 for receiving input data can be similar to the block 802 of FIG. 8 for receiving image data. The camera system can receive image data from a camera. In some aspects, other input data can be received. For example, the camera system can receive a current time. The camera system can receive an RFID signal (which can be used for identification purposes, as described herein). The camera system can receive physiological values (such as blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, and/or pleth variability index) from a patient sensor device, such as a wearable device.

At block 908, it can be determined whether a trigger has been satisfied. The camera system can determine whether a trigger has been satisfied to apply one or more machine learning models. In some aspects, the camera system can determine whether a trigger has been satisfied by checking whether a person has been detected. In some aspects, the camera system can determine whether a trigger has been satisfied by checking whether the current time satisfies a trigger time window, such as, but not limited to, a daily check-up time window. If a trigger is satisfied, the method 900 proceeds to the block 910 to receive captured data. If a trigger is not satisfied, the method 900 proceeds to repeat the previous blocks 906, 908 to continue checking for triggers.
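A time-window trigger of this kind can be sketched as below. The function name and the half-open window convention (start inclusive, end exclusive) are assumptions; the sketch also handles a window that wraps past midnight.

```python
from datetime import time


def in_check_up_window(current: time, start: time, end: time) -> bool:
    """Return True when the current time falls inside the daily window."""
    if start <= end:
        return start <= current < end
    # The window wraps past midnight (for example 23:00 to 01:00).
    return current >= start or current < end


# Hypothetical morning check-up window from 09:00 to 10:00.
triggered = in_check_up_window(time(9, 30), time(9, 0), time(10, 0))
```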

In some aspects, a trigger can be determined based on a received physiological value. The camera system can determine to begin a monitoring process based on a physiological value. In some aspects, the wearable device can include a pulse oximetry sensor and the physiological value is for blood oxygen saturation. The camera system can determine that the physiological value is below a threshold level (such as blood oxygen below 88%, 80%, or 70%, etc.). In some aspects, the wearable device can include a respiration rate sensor and the physiological value is for respiration rate. The camera system can determine that the physiological value satisfies a threshold alarm level (such as respiration rate under 12 or over 25 breaths per minute). In some aspects, the wearable device can include a heart rate sensor, the physiological value is for heart rate, and multiple physiological values measuring heart rate over time can be received from the wearable device. The camera system can determine that the physiological values satisfy a threshold alarm level, such as, but not limited to, heart rate being above 100 beats per minute for a threshold period of time or under a threshold level for a threshold period of time.
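The physiological triggers can be sketched with the example limits given above (88% blood oxygen saturation, 12 and 25 breaths per minute, 100 beats per minute). The function names, the sample format, and the 60-second sustained period are assumptions; the disclosure leaves the threshold period unspecified.

```python
from typing import List, Optional, Tuple


def spo2_trigger(spo2_percent: float, threshold: float = 88.0) -> bool:
    """Blood oxygen saturation below the threshold level."""
    return spo2_percent < threshold


def respiration_trigger(breaths_per_min: float,
                        low: float = 12.0, high: float = 25.0) -> bool:
    """Respiration rate under the low limit or over the high limit."""
    return breaths_per_min < low or breaths_per_min > high


def sustained_high_heart_rate(samples: List[Tuple[float, float]],
                              limit_bpm: float = 100.0,
                              min_duration_s: float = 60.0) -> bool:
    """samples are (timestamp_seconds, bpm) pairs in time order.

    min_duration_s is a hypothetical threshold period.
    """
    run_start: Optional[float] = None
    for timestamp, bpm in samples:
        if bpm > limit_bpm:
            if run_start is None:
                run_start = timestamp        # start of a continuous high run
            if timestamp - run_start >= min_duration_s:
                return True
        else:
            run_start = None                 # the run is broken; start over
    return False
```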

At block 910, captured data can be received. The block 910 for receiving captured data can be similar to the previous block 906 for receiving input data. Moreover, the camera in the camera system can continuously capture images, which can lead to the captured data. In some aspects, the camera system can receive audio data from a microphone. In some aspects, the camera system can be configured to cause presentation, on a display, of a prompt to cause a person to perform an activity. The camera system can receive, from a camera, image data of a recording of the activity.

At block 912, one or more machine learning models can be applied. In response to determining that a trigger has been satisfied, the camera system can apply one or more machine learning models based on the captured data. The camera system can invoke (which can be invoked on a hardware accelerator) one or more machine learning models, which can output a model result. The camera system can invoke a fall detection model based on image data where the fall detection model can output a classification result. The camera system can invoke a loud noise detection model based on the audio data where the loud noise detection model can output a classification result. In some aspects, the camera system can generate spectrogram data from the audio data and provide the spectrogram data as input to the loud noise detection model. The camera system can invoke a facial feature extraction model based on the image data where the facial feature extraction model can output a facial feature vector. The camera system can invoke a handwashing detection model based on the image data where the handwashing detection model can output a classification result. The camera system can invoke a screening machine learning model based on image data where the screening machine learning model can output a model result. The screening machine learning model can include, but is not limited to, a pupillometry screening model or a facial paralysis screening model.

In some aspects, in response to determining to begin the monitoring process, the camera system can invoke one or more machine learning models. The camera system can invoke (which can be on a hardware accelerator) a first unconscious detection model based on the image data where the first unconscious detection model outputs a first classification result. The camera system can invoke (which can be on the hardware accelerator) a second unconscious detection model based on the audio data where the second unconscious detection model outputs a second classification result.

At block 914, it can be determined whether there is a safety issue. The camera system can determine whether there is a safety issue. For each machine learning model that is invoked, the camera system can receive a classification result as output. For some models, the output can be a binary result, such as, “yes” a fall has been detected or “no” a fall has not been detected. For other models, the output can be a percentage result and the camera system can determine a safety issue exists if the percentage result is above a threshold. In some aspects, evaluation of the one or more machine learning models can result in an issue detection if at least one model returns a result that indicates issue detection. The camera system can detect a potential fall based on the classification result. The camera system can detect a potential scream or loud noise based on the classification result from a loud noise detection model. The camera system can execute a query of a facial features database based on the facial feature vector where executing the query can indicate that the facial feature vector is not present in a facial features database, which can indicate a safety issue. The camera system can detect a potential screening issue based on the classification result. The potential screening issue can indicate, but is not limited to, potential dilated pupils or potential facial paralysis. In some aspects, based on the output from one or more machine learning models, the camera system can detect a potential state of unconsciousness. If a safety issue is detected, the method 900 proceeds to block 916 to provide an alert and/or take an action. If a safety issue is not detected, the method 900 proceeds to repeat the previous blocks 906, 908 to continue checking for triggers.
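One way the facial features database query could work is a similarity search over stored feature vectors. The disclosure does not specify a matching method, so the cosine similarity measure, the 0.9 cutoff, the in-memory dictionary, and the function names below are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_known_face(vector: List[float],
                  database: Dict[str, List[float]],
                  min_similarity: float = 0.9) -> bool:
    """True if any stored vector is close enough to the queried vector."""
    return any(cosine_similarity(vector, known) >= min_similarity
               for known in database.values())


# Hypothetical two-dimensional vectors; real feature vectors are much longer.
database = {"resident": [1.0, 0.0]}
possible_intruder = not is_known_face([0.0, 1.0], database)
```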

At block 916, an alert can be provided and/or an action can be taken. In some aspects, the camera system can initiate an alert. The camera system can notify a monitoring system to provide an alert. In some aspects, the camera system can initiate an action. In some aspects, the block 916 for providing an alert and/or taking an action can be similar to the block 816 of FIG. 8 for providing an alert and/or taking an action. In response to detecting an issue, such as, but not limited to, detecting a potential fall, loud noise, scream, lack of handwashing, dilated pupils, facial paralysis, intruder, state of unconsciousness, etc., the monitoring system can provide an alert. The monitoring system can escalate alerts. For example, in response to detecting a potential fall and a potential scream or loud noise, the monitoring system can provide an escalated alert. The camera system can cause the monitoring system to take an action. For example, the monitoring system can automatically notify emergency services (such as an emergency hotline and/or an ambulance service) to send someone to help.

In some aspects, the monitoring system can allow privacy options. For example, some user profiles can specify that the user computing devices associated with those profiles should not receive alerts (which can be specified for a period of time). However, the monitoring system can include an alert escalation policy such that alerts can be presented via user computing devices based on one or more escalation conditions. For example, if an alert is not responded to for a period of time, the monitoring system can escalate the alert. As another example, if a quantity of alerts exceeds a threshold, then the monitoring system can present an alert via user computing devices despite user preferences otherwise.
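The interaction between a privacy option and the escalation policy can be sketched as follows. The `AlertState` fields and the limits of 300 seconds and 3 alerts are hypothetical; the disclosure describes the escalation conditions without giving values.

```python
from dataclasses import dataclass


@dataclass
class AlertState:
    muted: bool                 # the user profile asked not to receive alerts
    seconds_unanswered: float   # time since the oldest unanswered alert
    alert_count: int            # quantity of alerts raised so far


def should_present(state: AlertState,
                   max_unanswered_s: float = 300.0,
                   max_alerts: int = 3) -> bool:
    """Decide whether to present an alert on the user computing device."""
    if not state.muted:
        return True
    # Escalation conditions override the privacy preference.
    return (state.seconds_unanswered > max_unanswered_s
            or state.alert_count > max_alerts)
```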

At block 918, a communications system can be provided. The monitoring system can provide a communications system. The camera system can receive, from a computing device, first video data (such as, but not limited to, video data of a clinician, friends, or family of a patient). The camera system can cause presentation, on the display, of the first video data. The camera system can receive, from the camera, second video data and transmit, to the computing device, the second video data.

Elderly Care Features

Some of the aspects described herein can be directed towards elderly care features. The monitoring systems described herein can be applied to assisted living and/or home settings for the elderly. The monitoring systems described herein, which can include camera systems, can generally monitor activities of the elderly. The monitoring systems described herein can initiate check-up processes, including, but not limited to, dementia checks. In some aspects, a check-up process can detect a color of skin to detect possible physiological changes. The monitoring system can perform stroke detection by determining changes in facial movements and/or speech patterns. The monitoring system can track medication administration and provide reminders if medication is not taken. For example, the monitoring system can monitor a cupboard or medicine drawer and determine whether medication is taken based on activity in those areas. In some aspects, some of the camera systems can be outdoor camera systems. The monitoring system can track when a person goes for a walk, log when the person leaves and returns, and potentially issue an alert if a walk exceeds a threshold period of time. In some aspects, the monitoring system can track usage of good hygiene practices, such as but not limited to, handwashing, brushing teeth, or showering (e.g., tracking that a person enters a bathroom at a showering time). The monitoring system can keep track of whether a person misses a check-up. In some aspects, a camera system can include a thermal camera, which can be used to identify a potentially wet adult diaper from an input thermal image.

With respect to FIG. 9, the method 900 for efficiently applying machine learning models can be applied to elderly care settings. At block 902, a training data set can be received. The monitoring system can receive a training data set, which can be used to train machine learning models to be used in check-up processes for the elderly, such as checking for dilated pupils or facial paralysis. For machine learning of dilated pupils, a first set of images with dilated pupils and a second set of images without dilated pupils can be collected; and a training data set can be created from the first set of images and the second set of images. For machine learning of facial paralysis, a first set of images with facial paralysis and a second set of images without facial paralysis can be collected; and a training data set can be created from the first set of images and the second set of images.

At block 904, a machine learning model can be trained. A server in the monitoring system can train a pupillometry screening model using the training data set. The server in the monitoring system can train a facial paralysis screening model using the training data set.

At block 906, input data can be received. The camera system can receive input data, which can be used to determine if a trigger has been satisfied for application of one or more machine learning models. The camera system can receive image data from a camera. The camera system can receive a current time. The camera system can receive an RFID signal, which can be used for person identification and/or detection.

In some aspects, the monitoring system can include patient sensor devices, such as, but not limited to, wearable devices. The wearable device can be configured to process sensor signals to determine a physiological value for the person. The monitoring system can receive a physiological value from the wearable device. In some aspects, the wearable device can include a pulse oximetry sensor and the physiological value can be for blood oxygen saturation. In some aspects, the wearable device can be configured to process the sensor signals to measure at least one of blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, or pleth variability index. Some of the wearable devices can be used for an infant.

At block 908, it can be determined whether a trigger has been satisfied. The camera system can determine whether a trigger has been satisfied to apply one or more machine learning models. The camera system can determine whether a check-up process should begin from a current time. For example, the monitoring system can conduct check-up processes at regular intervals, such as once or twice a day, which can be at particular times, such as a morning check-up time or an afternoon check-up time. As described herein, another trigger type can be detection of a person. The camera system can invoke a person detection model based on image data where the person detection model outputs a classification result; and detect a person based on the classification result. If a trigger is satisfied, the method 900 proceeds to the block 910 to receive captured data. If a trigger is not satisfied, the method 900 proceeds to repeat the previous blocks 906, 908 to continue checking for triggers.

At block 910, captured data can be received. In response to determining to begin the check-up process, the monitoring system can cause presentation, on a display, of a prompt to cause a person to perform a check-up activity. In some aspects, the check-up activity can check for signs of dementia. A check-up activity can include having a person stand a particular distance from the camera system. A check-up activity can include simple exercises. The prompts can cause a user to say something or perform tasks. The person can be prompted to perform math tasks, recognize patterns, solve puzzles, and/or identify photos of family members. For example, the person can be prompted to point to sections of the display, which can correspond to answers to check-up tests. The check-up tests can check for loss of motor skills. In some aspects, the check-up activity can include a virtual physical or appointment conducted by a clinician. The camera system can receive, from the camera, image data of a recording of the check-up activity. In some aspects, the camera system can receive other input, such as, but not limited to, audio data from a microphone.

At block 912, one or more machine learning models can be applied. In response to determining that a trigger has been satisfied, the camera system can apply one or more machine learning models based on the captured data. In some aspects, in response to determining to begin the check-up process, the camera system can invoke a screening machine learning model based on image data where the screening machine learning model can output a model result (such as a classification result). The screening machine learning model can include, but is not limited to, a pupillometry screening model, a facial paralysis screening model, or a gesture detection model. The gesture detection model can be configured to detect a gesture directed towards a portion of the display. For example, during a dementia test, the person can be prompted to point to a portion of the display and the gesture detection model can identify a point gesture, such as but not limited to, pointing to a quadrant on the display. In some aspects, in response to detecting a person, the camera system can invoke a handwashing detection model based on image data wherein the handwashing detection model outputs a classification result.

At block 914, it can be determined whether there is a safety issue. The camera system can determine whether there is a safety issue, such as a potential screening issue. The camera system can detect a potential screening issue based on the model result. The potential screening issue can indicate, but is not limited to, potential dilated pupils or potential facial paralysis. The monitoring system can determine whether there is a potential screening issue based on output from a gesture detection model. For example, the monitoring system can use a detected gesture to determine an answer and an incorrect answer can indicate a potential screening issue. If a safety issue is detected, the method 900 proceeds to block 916 to provide an alert and/or take an action. If a safety issue is not detected, the method 900 proceeds to repeat the previous blocks 906, 908 to continue checking for triggers.
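The gesture-based check can be sketched as mapping a pointed-at location to a display quadrant and comparing it with the expected answer. The sketch assumes, hypothetically, that the gesture detection model yields a display coordinate with the origin at the top-left corner; the function names and quadrant labels are illustrative.

```python
def quadrant(x: float, y: float, width: float, height: float) -> str:
    """Name the display quadrant containing the pointed-at coordinate."""
    horizontal = "left" if x < width / 2 else "right"
    vertical = "top" if y < height / 2 else "bottom"
    return f"{vertical}-{horizontal}"


def potential_screening_issue(pointed_quadrant: str, correct_quadrant: str) -> bool:
    """An incorrect answer can indicate a potential screening issue."""
    return pointed_quadrant != correct_quadrant


# Hypothetical 1920x1080 display with the correct answer in the top-left quadrant.
answer = quadrant(1500.0, 900.0, 1920.0, 1080.0)
issue = potential_screening_issue(answer, "top-left")
```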

At block 916, an alert can be provided. In some aspects, the camera system can initiate an alert. The camera system can notify a monitoring system to provide one or more alerts. In response to detecting an issue in an elderly care setting, such as, but not limited to, detecting a potential fall, loud noise, scream, lack of handwashing, dilated pupils, facial paralysis, intruder, etc., the monitoring system can provide an alert. The monitoring system can escalate alerts. For example, in response to detecting a potential fall and a potential scream or loud noise, the monitoring system can provide an escalated alert. In some aspects, the monitoring system can provide alerts via different networks (such as Wi-Fi or cellular) and/or technologies (such as Bluetooth).

At block 918, a communications system can be provided. The monitoring system can provide a communications system in an elderly care setting. The camera system can receive, from a computing device, first video data (such as, but not limited to, video data of a clinician, friends, or family of a patient). The camera system can cause presentation, on the display, of the first video data. The camera system can receive, from the camera, second video data and transmit, to the computing device, the second video data.

Infant Care Features

Some of the aspects described herein can be directed towards infant care features. The monitoring systems described herein can be applied to monitoring an infant. FIG. 10 is a flowchart of a method 1000 for efficiently applying machine learning models for infant care, according to some aspects of the present disclosure. As described herein, a monitoring system, which can include a camera system, may implement aspects of the method 1000 as described herein. The block(s) of the method 1000 of FIG. 10 can be similar to the block(s) of the methods 800, 900 of FIGS. 8 and/or 9. The method 1000 may include fewer or additional blocks and/or the blocks may be performed in an order different than is illustrated.

Beginning at block 1002, image data can be received. A camera system can receive image data from a camera, which can be positioned in an infant area, such as a nursery. Image data can also include, but is not limited to, a sequence of images. A camera in a camera system can continuously capture images of the infant area. Therefore, the camera in a camera system can capture images of objects, such as an infant, in a room either at a home or a clinical facility.

At block 1006, an infant detection model can be applied. The camera system can apply the infant detection model based on the image data. In some aspects, the camera system can invoke the infant detection model on a hardware accelerator. The infant detection model can be configured to receive image data as input. The infant detection model can be configured to output a classification result. In some aspects, the classification result can indicate a likelihood (such as a percentage chance) that the image data includes an infant. In other aspects, the classification result can be a binary result: either the infant object is predicted as present in the image or not. The infant detection model can be, but is not limited to, a CNN. The infant detection model can be trained to detect infants. For example, the infant detection model can be trained with a training data set with labeled examples indicating whether the input data includes an infant or not.

At block 1008, it can be determined whether an infant is present. The camera system can determine whether an infant is present. The camera system can determine whether an infant object is located in the image data. The camera system can receive from the infant detection model the output of a classification result. In some aspects, the output can be a binary result, such as, “yes” there is an infant object present or “no” there is not an infant object present. In other aspects, the output can be a percentage result and the camera system can determine the presence of an infant if the percentage result is above a threshold. If an infant is detected, the method 1000 proceeds to the block 1010 to receive captured data. If an infant is not detected, the method 1000 proceeds to repeat the previous blocks 1002, 1006, 1008 to continue checking for infants.

At block 1010, captured data can be received. The camera in the camera system can continuously capture images, which can lead to the captured data. In some aspects, the camera system can receive audio data from a microphone.

At block 1012, one or more infant safety models can be applied. In response to detecting an infant, the camera system can apply one or more infant safety models that output a model result. The camera system can invoke (which can be invoked on a hardware accelerator) an infant position model based on the captured data. The infant position model can output a classification result. In some aspects, the infant position model can be or include a CNN. In response to detecting an infant, the camera system can invoke a facial feature extraction model based on second image data where the facial feature extraction model outputs a facial feature vector. The camera system can execute a query of a facial features database based on the facial feature vector where executing the query indicates that the facial feature vector is not present in the facial features database. An infant safety model can be an infant color detection model. In some aspects, the model result can include coordinates of a boundary region identifying an infant object in the image data. As described herein, the camera system can invoke a loud noise detection model based on the audio data where the loud noise detection model can output a classification result.

At block 1014, it can be determined whether there is an infant safety issue. The camera system can determine whether there is an infant safety issue. As described above, for each person safety model that is invoked, the camera system can receive a model result as output. For some models, the output can be a binary result, such as, “yes” an infant is in a supine position or “no” a supine position has not been detected (such as the infant potentially laying on their stomach). For other models, the output can be a percentage result and the camera system can determine an infant safety issue exists if the percentage result is above a threshold. The camera system can determine that an unrecognized person has been detected. In some aspects, the camera system can determine that the coordinates of the boundary region exceed a threshold distance from an infant zone (which can indicate that an infant is being removed from the infant zone). The camera system can determine a potential scream from the model result. In some aspects, evaluation of the one or more infant safety models can result in an issue detection if at least one model returns a result that indicates issue detection. If an infant safety issue is detected, the method 1000 proceeds to block 1016 to provide an alert and/or take an action. If an infant safety issue is not detected, the method 1000 proceeds to repeat the previous blocks 1002, 1006, 1008 to continue checking for infants.
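As a minimal sketch, the boundary region check and the "at least one model" evaluation of block 1014 could look like the following. The rectangular representation of the infant zone, the use of the center of the boundary region, the Euclidean distance measure, and all names are assumptions made for this example; the disclosure does not prescribe them.

```python
import math
from typing import Iterable, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in image coordinates.
Box = Tuple[float, float, float, float]


def distance_from_zone(region: Box, zone: Box) -> float:
    """Distance from the center of the boundary region to the infant zone.

    Returns 0.0 when the center lies inside the zone rectangle.
    """
    cx = (region[0] + region[2]) / 2
    cy = (region[1] + region[3]) / 2
    dx = max(zone[0] - cx, 0.0, cx - zone[2])
    dy = max(zone[1] - cy, 0.0, cy - zone[3])
    return math.hypot(dx, dy)


def outside_infant_zone(region: Box, zone: Box, threshold: float) -> bool:
    """True when the boundary region exceeds the threshold distance from the zone."""
    return distance_from_zone(region, zone) > threshold


def safety_issue_detected(model_flags: Iterable[bool]) -> bool:
    """An issue is detected if at least one model result indicates an issue."""
    return any(model_flags)
```

For instance, the flag produced by `outside_infant_zone` could be passed to `safety_issue_detected` together with flags derived from other infant safety models, so that a single positive result is sufficient to proceed to block 1016.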

At block 1016, an alert can be provided and/or an action can be taken. In some aspects, the camera system can initiate an alert associated with the infant. The camera system can notify a monitoring system to provide an alert. In some aspects, a user computing device 102 can receive an alert about an infant safety issue. In some aspects, a clinician 110 can receive an alert about the infant safety issue. In some aspects, the camera system can initiate an action. The camera system can cause the monitoring system to take an action. For example, the monitoring system can automatically notify emergency services (such as an emergency hotline and/or an ambulance service) to send someone to help.

At Home Features

Some of the aspects described herein can be directed towards at-home monitoring features. The monitoring systems described herein can be applied to monitoring in a home. The monitoring system can accomplish one or more of the following features using the machine learning techniques described herein. The monitoring system can monitor the time spent on various tasks by members of a household (such as youth at home), such as time spent watching television or time spent studying. The monitoring system can be configured to confirm that certain tasks (such as chores) are completed. In some aspects, the monitoring system can allow parents to monitor an amount of time spent using electronics. In some aspects, the camera system can be configured to detect night terrors and the amount and types of sleep. As described herein, in some aspects, the monitoring system can track usage of good hygiene practices at home, such as, but not limited to, handwashing, brushing teeth, or showering (e.g., tracking that a person enters a bathroom at a showering time). As described herein, zones can be used to provide alerts, such as monitoring a pool zone or other spaces where youth should not be allowed, such as, but not limited to, certain rooms at certain times and/or unaccompanied by an adult. For example, the camera system can monitor a gun storage location to alert adults to unauthorized access of weapons.

General Features

Some of the aspects described herein can include any of the following features, which can be applied in different settings. In some aspects, a camera system can have local storage for an image and/or video feed. In some aspects, remote access of the local storage may be restricted and/or limited. In some aspects, the camera system can use a calibration factor which can be useful for correcting color drift in the image data from a camera. In some aspects, the camera system can add or remove filters on the camera to provide certain effects. The camera system may include infrared filters. In some aspects, the monitoring system can monitor food intake of a subject and/or estimate calories. In some aspects, the monitoring system can detect mask wearing (such as wearing or not wearing an oxygen mask).

The monitoring system can perform one or more check-up tests. The monitoring system, using a machine learning model, can detect slurred speech, drunkenness, drug use, and/or adverse behavior. Based on other check-up tests the monitoring system can detect shaking, microtremors, and/or tremors, which can indicate a potential disease state such as Parkinson's. The monitoring system can track exercise movements to determine a potential physiological condition. A check-up test can be used by the monitoring system for a cognitive assessment, such as detecting vocabulary decline. In some aspects, the monitoring system can check a user's smile where the monitoring system prompts the user to stand a specified distance away from the camera system. A check-up test can request a subject to do one or more exercises, read something out loud (to test muscles of a face), and/or reach for an object. In some aspects, the camera system can perform an automated physical, perform a hearing test, and/or perform an eye test. In some aspects, a check-up test can be for Alzheimer's detection. The monitoring system can provide memory exercises, monitor for good/bad days, and/or monitor basic behavior to prevent injury. In some aspects, the camera system can monitor skin color changes to detect skin damage and/or sunburn. The camera system can take a trend of skin color, advise or remind to take corrective action, and/or detect a tan line. The monitoring system can monitor sleep cycles and/or heart rate variability. In some aspects, the monitoring system can monitor snoring, rapid eye movement (REM), and/or sleep quality, which can be indicative of sleep apnea or another disease. As described herein, the camera system can be trained to detect sleepwalking. The camera system can be configured to detect coughing or sneezing to determine potential allergies or illness. The camera system can also provide an alert if a possible hyperventilation is detected. Any of the monitoring features described herein can be implemented with the machine learning techniques described herein.

Additional Implementation Details

FIG. 11 is a block diagram that illustrates example components of a computing device 1100, which can be a camera system. The computing device 1100 can implement aspects of the present disclosure, and, in particular, aspects of the monitoring system 100A, 100B, such as the camera system 114. The computing device 1100 can communicate with other computing devices.

The computing device 1100 can include a hardware processor 1102, a hardware accelerator 1116, a data storage device 1104, a memory device 1106, a bus 1108, a display 1112, one or more input/output devices 1114, and a camera 1118. A processor 1102 can also be implemented as a combination of computing devices, e.g., a combination of a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor, or any other such configuration. The processor 1102 can be configured, among other things, to process data and execute instructions to perform one or more functions, such as applying one or more machine learning models, as described herein. The hardware accelerator 1116 can be special hardware that is configured to accelerate machine learning applications. The data storage device 1104 can include a magnetic disk, optical disk, or flash drive, etc., and is provided and coupled to the bus 1108 for storing information and instructions. The memory 1106 can include one or more memory devices that store data, including without limitation, random access memory (RAM) and read-only memory (ROM). The computing device 1100 may be coupled via the bus 1108 to a display 1112, such as an LCD display or touch screen, for displaying information to a user, such as a patient. The computing device 1100 may be coupled via the bus 1108 to one or more input/output devices 1114. The input device 1114 can include, but is not limited to, a keyboard, mouse, digital pen, microphone, touch screen, gesture recognition system, voice recognition system, imaging device (which may capture eye, hand, head, or body tracking data and/or placement), gamepad, accelerometer, or gyroscope. The camera 1118 can include, but is not limited to, a 1080p or 4k camera and/or an infrared image camera.

Additional Aspects and Terminology

As used herein, the term “patient” can refer to any person that is monitored using the systems, methods, devices, and/or techniques described herein. As used herein, a “patient” is not required to be admitted to a hospital; rather, the term “patient” can refer to a person that is being monitored. As used herein, in some cases the terms “patient” and “user” can be used interchangeably.

While some features described herein may be discussed in a specific context, such as adult, youth, infant, elderly, or pet care, those features can be applied to other contexts, such as, but not limited to, a different one of adult, youth, infant, elderly, or pet care contexts.

The apparatuses and methods described herein may be implemented by one or more computer programs executed by one or more processors. The computer programs include processor-executable instructions that are stored on a non-transitory tangible computer readable medium. The computer programs may also include stored data. Non-limiting examples of the non-transitory tangible computer readable medium are nonvolatile memory, magnetic storage, and optical storage.

The term “substantially” when used in conjunction with the term “real-time” forms a phrase that will be readily understood by a person of ordinary skill in the art. For example, it is readily understood that such language will include speeds in which no or little delay or waiting is discernible, or where such delay is sufficiently short so as not to be disruptive, irritating, or otherwise vexing to a user.

Conditional language used herein, such as, among others, “can,” “might,” “may,” “e.g.,” “for example,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain aspects described herein include, while other aspects described herein do not include, certain features, elements, or states. Thus, such conditional language is not generally intended to imply that features, elements, or states are in any way required for one or more aspects described herein.

Disjunctive language such as the phrase “at least one of X, Y, or Z,” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Such disjunctive language is not generally intended to, and should not, imply that certain aspects require at least one of X, at least one of Y, or at least one of Z to each be present. Thus, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that when used, for example, to connect a list of elements, the term “or” means one, some, or all of the elements in the list. Further, the term “each,” as used herein, in addition to having its ordinary meaning, can mean any subset of a set of elements to which the term “each” is applied.

The term “a” as used herein should be given an inclusive rather than exclusive interpretation. For example, unless specifically noted, the term “a” should not be understood to mean “exactly one” or “one and only one”; instead, the term “a” means “one or more” or “at least one,” whether used in the claims or elsewhere in the specification and regardless of uses of quantifiers such as “at least one,” “one or more,” or “a plurality” elsewhere in the claims or specification.

The terms “comprising,” “including,” “having,” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth.

While the above detailed description has shown, described, and pointed out novel features as applied to various aspects described herein, it will be understood that various omissions, substitutions, and changes in the form and details of the devices or algorithms illustrated can be made without departing from the spirit of the disclosure. As will be recognized, certain aspects described herein can be embodied within a form that does not provide all of the features and benefits set forth herein, as some features can be used or practiced separately from others.

What is claimed is:
 1. A system comprising: a storage device configured to store instructions; a display; a camera; and a hardware processor configured to execute the instructions to: receive a current time; determine to begin a check-up process from the current time; and in response to determining to begin the check-up process, cause presentation, on the display, of a prompt to cause a person to perform a check-up activity, receive, from the camera, image data of a recording of the check-up activity, invoke a screening machine learning model based on the image data, wherein the screening machine learning model outputs a classification result, detect a potential screening issue based on the classification result, and in response to detecting the potential screening issue, provide an alert.
 2. The system of claim 1, wherein the screening machine learning model is a pupillometry screening model, and wherein the potential screening issue indicates potential dilated pupils.
 3. The system of claim 1, wherein the screening machine learning model is a facial paralysis screening model, and wherein the potential screening issue indicates potential facial paralysis.
 4. The system of claim 1, further comprising a wearable device configured to process sensor signals to determine a physiological value for the person, wherein the hardware processor is configured to execute further instructions to: receive, from the wearable device, the physiological value; and generate the alert comprising the physiological value.
 5. The system of claim 4, wherein the wearable device comprises a pulse oximetry sensor and the physiological value is for blood oxygen saturation.
 6. The system of claim 4, wherein the wearable device is further configured to process the sensor signals to measure at least one of blood oxygen saturation, pulse rate, perfusion index, respiration rate, heart rate, or pleth variability index.
 7. The system of claim 1, wherein the hardware processor is configured to execute further instructions to: receive, from a second computing device, first video data; cause presentation, on the display, of the first video data; receive, from the camera, second video data; and transmit, to the second computing device, the second video data.
 8. A method comprising: receiving a current time; determining to begin a check-up process from the current time; and in response to determining to begin the check-up process, causing presentation, on a display, of a prompt to cause a person to perform a check-up activity, receiving, from a camera, image data of a recording of the check-up activity, invoking a screening machine learning model based on the image data, wherein the screening machine learning model outputs a model result, detecting a potential screening issue based on the model result, and in response to detecting the potential screening issue, providing an alert.
 9. The method of claim 8, wherein the screening machine learning model is a pupillometry screening model, and wherein the potential screening issue indicates potential dilated pupils, further comprising: collecting a first set of images of dilated pupils; collecting a second set of images without dilated pupils; creating a training data set comprising the first set of images and the second set of images; and training the pupillometry screening model using the training data set.
 10. The method of claim 8, wherein the screening machine learning model is a facial paralysis screening model, and wherein the potential screening issue indicates potential facial paralysis, further comprising: collecting a first set of images of facial paralysis; collecting a second set of images without facial paralysis; creating a training data set comprising the first set of images and the second set of images; and training the facial paralysis screening model using the training data set.
 11. The method of claim 8, wherein the check-up activity comprises a dementia test, and wherein the screening machine learning model comprises a gesture detection model.
 12. The method of claim 11, wherein the gesture detection model is configured to detect a gesture directed towards a portion of the display.
 13. The method of claim 8, further comprising: receiving, from the camera, second image data; invoking a person detection model based on the second image data, wherein the person detection model outputs a first classification result; detecting a person based on the first classification result; receiving, from the camera, third image data; and in response to detecting the person, invoking a handwashing detection model based on the third image data, wherein the handwashing detection model outputs a second classification result, detecting a potential lack of handwashing based on the second classification result, and in response to detecting a lack of handwashing, providing a second alert.
 14. A system comprising: a storage device configured to store instructions; a camera; and a hardware processor configured to execute the instructions to: receive, from the camera, first image data; invoke an infant detection model based on the first image data, wherein the infant detection model outputs a classification result; detect an infant based on the classification result; receive captured data; and in response to detecting the infant, invoke an infant safety model based on the captured data, wherein the infant safety model outputs a model result, detect a potential safety issue based on the model result, and in response to detecting the potential safety issue, provide an alert.
 15. The system of claim 14, wherein the infant safety model is an infant position model, and wherein the potential safety issue indicates the infant potentially laying on their stomach.
 16. The system of claim 15, wherein the hardware processor is configured to execute further instructions to: receive, from the camera, second image data; and in response to detecting the infant, invoke a facial feature extraction model based on the second image data, wherein the facial feature extraction model outputs a facial feature vector, execute a query of a facial features database based on the facial feature vector, wherein executing the query indicates that the facial feature vector is not present in the facial features database, and in response to determining that the facial feature vector is not present in the facial features database, provide an unrecognized person alert.
 17. The system of claim 14, wherein the infant safety model is an infant color detection model, and wherein the potential safety issue indicates potential asphyxiation.
 18. The system of claim 14, wherein the model result comprises coordinates of a boundary region identifying an infant object in the captured data, and wherein detecting the potential safety issue comprises: determining that the coordinates of the boundary region exceed a threshold distance from an infant zone.
 19. The system of claim 14, further comprising a wearable device configured to process sensor signals to determine a physiological value for the infant, wherein the hardware processor is configured to execute further instructions to: receive, from the wearable device, the physiological value; and generate the alert comprising the physiological value.
 20. The system of claim 14, further comprising a microphone, wherein the captured data is received from the microphone, wherein the infant safety model is a loud noise detection model, and wherein the potential safety issue indicates a potential scream.